This thesis is a contribution to the debate on the implications of quantum information 
theory for the foundational problems of quantum mechanics. 

In Part I an attempt is made to shed some light on the nature of information and 
quantum information theory. It is emphasized that the everyday notion of information 
is to be firmly distinguished from the technical notions arising in information theory; 
however it is maintained that in both settings 'information' functions as an abstract 
noun, hence does not refer to a particular or substance. The popular claim 'Information 
is Physical' is assessed and it is argued that this proposition faces a destructive dilemma. 
Accordingly, the slogan may not be understood as an ontological claim, but at best, as 
a methodological one. A novel argument is provided against Dretske's (1981) attempt 
to base a semantic notion of information on ideas from information theory. 

The function of various measures of information content for quantum systems is ex- 
plored and the applicability of the Shannon information in the quantum context main- 
tained against the challenge of Brukner and Zeilinger (2001). The phenomenon of quan- 
tum teleportation is then explored as a case study serving to emphasize the value of 
recognising the logical status of 'information' as an abstract noun: it is argued that the 
conceptual puzzles often associated with this phenomenon result from the familiar error 
of hypostatizing an abstract noun. 

The approach of Deutsch and Hayden (2000) to the questions of locality and infor- 
mation flow in entangled quantum systems is assessed. It is suggested that the approach 
suffers from an equivocation between a conservative and an ontological reading; and the 
differing implications of each are examined. Some results are presented on the
characterization of entanglement in the Deutsch-Hayden formalism. 

Part I closes with a discussion of some philosophical aspects of quantum computation. 
In particular, it is argued against Deutsch that the Church-Turing hypothesis is not 
underwritten by a physical principle, the Turing Principle. Some general morals are 
drawn concerning the nature of quantum information theory. 

In Part II, attention turns to the question of the implications of quantum information 
theory for our understanding of the meaning of the quantum formalism. Following some 
preliminary remarks, two particular information-theoretic approaches to the foundations 
of quantum mechanics are assessed in detail. It is argued that Zeilinger's (1999) Founda- 
tional Principle is unsuccessful as a foundational principle for quantum mechanics. The 
information-theoretic characterization theorem of Clifton, Bub and Halvorson (2003) 
is assessed more favourably, but the generality of the approach is questioned and it is 
argued that the implications of the theorem for the traditional foundational problems 
in quantum mechanics remain obscure. 
Introduction 



Much is currently made of the concept of information in physics, following the rapid 
growth of the fields of quantum information theory and quantum computation. These 
are new and exciting fields of physics whose interests for those concerned with the foun- 
dations and conceptual status of quantum mechanics are manifold. On the experimental 
side, the focus on the ability to manipulate and control individual quantum systems, 
both for computational and cryptographic purposes, has led not only to detailed
realisation of many of the gedanken-experiments familiar from foundational discussions
(see e.g. Zeilinger (1999a)), but also to wholly new demonstrations of the oddity of the
quantum world (Boschi et al., 1995; Bouwmeester et al., 1997; Furusawa et al., 1998).



Developments on the theoretical side are no less important and interesting. Concentra- 
tion on the possible ways of using the distinctively quantum mechanical properties of 
systems for the purposes of carrying and processing information has led to considerable 
deepening of our understanding of quantum theory. The study of the phenomenon of 
entanglement, for example, has come on in leaps and bounds under the aegis of quantum
information (see e.g. Bruss (2002) for a review of recent developments).



The excitement surrounding these fields is not solely due to the advances in the 
physics, however. It is due also to the seductive power of some more overtly philosophical 
(indeed, controversial) theses. There is a feeling that the advent of quantum information 
theory heralds a new way of doing physics and supports the view that information should 
play a more central role in our world picture. In its extreme form, the thought is that 
information is perhaps the fundamental category from which all else flows (a view with 
obvious affinities to idealism)^, and that the new task of physics is to discover and 

^Consider, for example, Wheeler's infamous 'It from Bit' proposal, the idea that every physical thing 
(every 'it') derives its existence from the answer to yes-no questions posed by measuring devices: 'No 



describe how this information evolves, manifests itself and can be manipulated. Less 
extravagantly, we have the ubiquitous, but baffling, claim that 'Information is Physical'
(Landauer, 1996) and the widespread hope that quantum information theory will have
something to tell us about the still vexed questions of the interpretation of quantum 
mechanics. 

These claims are ripe for philosophical analysis. To begin with, the
seductiveness of such thoughts appears to stem, at least in part, from a confusion between
two senses of the term 'information' which must be distinguished: 'information' as a 
technical term which can have a legitimate place in a purely physical language, and 
the everyday concept of information associated with knowledge, language and meaning, 
which is completely distinct and about which, I shall suggest, physics has nothing to 
say. The claim that information is physical is baffling, because the everyday concept of 
information is reliant on that of a person who might read or understand it, encode or 
decode it, and makes sense only within a framework of language and language users; 
yet it is by no means clear that such a setting may be reduced to purely physical 
terms; while the mere claim that some physically defined quantity (information in the 
technical sense) is physical would seem of little interest. The conviction that quantum 
information theory will have something to tell us about the interpretation of quantum 
mechanics seems natural when we consider that the measurement problem is in many 
ways the central interpretive problem in quantum mechanics and that measurement is 
a transfer of information, an attempt to gain knowledge. But this seeming naturalness 
only rests on a confusion between the two meanings of 'information'. 

My aim in this thesis is to clarify some of the issues raised here. In Part I, I attempt 
to shed some light on the question of the nature of information and quantum information 
theory, emphasising in particular the distinction between the technical and non-technical 
notions of information; in Part II, I turn to consider, in light of the preceding discussion, 
the question of what role the concept of information, and quantum information theory 

element in the description of physics shows itself as closer to primordial than the elementary quantum
phenomenon... in brief, the elementary act of observer participancy... It from bit symbolizes the idea
that every item of the physical world has at bottom (at a very deep bottom, in most instances) an
immaterial source and explanation; that which we call reality arises in the last analysis from the posing
of yes-no questions that are the registering of equipment-evoked responses; in short that all things
physical are information-theoretic in origin and this is a participatory universe.' (Wheeler, 1990, p.3,5)

in particular, might have to play in the foundations of quantum mechanics. What 
foundational implications might quantum information theory have? 

In Chapter 1 I begin by describing some features of the everyday notion of information
and indicate the lines of distinction from the technical notion of information deriving
from the work of Shannon (1948); I also highlight the important point that 'information'
is an abstract noun. Some of the distinctive ideas of quantum information theory are then
introduced, before I turn to consider the dilemma that faces the slogan 'Information is
Physical'. The claim that the everyday and information-theoretic notions of information
are to be kept distinct is defended against the view of Dretske (1981), who sought to
base a semantic notion of information on Shannon's theory. I present a novel argument
against Dretske's position. 

One of the more prominent proposals that seeks to establish a link between information
and the foundations of quantum mechanics is due to Zeilinger (1999b), who puts
forward an information-theoretic foundational principle for quantum mechanics. As a
part of this project, Brukner and Zeilinger (2001) have criticised Shannon's measure of
information, the quantity fundamental to the discussion of information in both classical
and quantum information theory. I address these arguments in Chapter 2 and show
their worries to be groundless. En passant the function of various notions of informa- 
tion content and total information content for quantum systems, including measures of 
mixedness, is investigated. 

Chapter 3 is a case study whose purpose is to illustrate the value of recognising clearly 
the logico-grammatical status of the term 'information' as an abstract noun: in this 
chapter I investigate the phenomenon of quantum teleportation. While teleportation is 
a straightforward consequence of the formalism of non-relativistic quantum mechanics, it 
has nonetheless given rise to a good deal of conceptual puzzlement. I illustrate how these 
puzzles generally arise from neglecting the fact that 'information' is an abstract noun. 
When one recognises that 'the information' does not refer to a particular or to some 
sort of pseudo-substance, any puzzles are quickly dispelled. One should not be seeking, 
in an information-theoretic protocol — quantum or otherwise — for some particular 'the 
information', whose path one is to follow, but rather concentrating on the physical 
processes by which the information is transmitted, that is, by which the end result of 
the protocol is brought about. When we bear this in mind for teleportation, we see that 
the only remaining source for dispute over the protocol is the quotidian one regarding 
what interpretation of quantum mechanics one wishes to adopt. 

Chapter 4 continues some of the themes from the preceding chapter. In it I discuss
the important paper of Deutsch and Hayden (2000), which would appear to have
significant implications for the nature and location of quantum information: Deutsch and
Hayden claim to have provided an account of quantum mechanics which is particularly
local, and which finally clarifies the nature of information flow in entangled quantum
systems. I provide a perspicuous description of their formalism and assess these claims. 
It proves essential to distinguish, as Deutsch and Hayden do not, between two ways of 
interpreting their formalism. On the first, conservative, interpretation, no benefits with 
respect to locality accrue that are not already available on either an Everettian or a 
statistical interpretation; and the conclusions regarding information flow are equivocal. 
The second, ontological interpretation, offers a framework with the novel feature that 
global properties of quantum systems are reduced to local ones; but no conclusions follow 
concerning information flow in more standard quantum mechanics. 

In Chapter 5 I investigate the characterization of bi-partite entanglement in the
Deutsch-Hayden formalism. The case of pure state entanglement is, as one would expect,
straightforward; more interesting is mixed state entanglement. The Horodeckis' positive
partial transpose condition (Horodecki et al., 1996a) provides necessary and sufficient
conditions in this case for 2 ⊗ 2 and 2 ⊗ 3 dimensional systems, but it remains an
interesting question how their condition may be understood in the geometrical setting 
of the Deutsch-Hayden formalism. I provide some sufficient conditions for mixed state 
entanglement which may be formulated in a simple geometrical way and provide some 
concrete illustrations of how the partial transpose operation can be seen to function 
from the point of view of the Deutsch-Hayden formalism. 
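
As a concrete illustration of the positive partial transpose condition mentioned above, here is a minimal sketch in the ordinary computational-basis matrix picture, not in the Deutsch-Hayden formalism. The choice of a two-qubit Werner state as the test case, and all function names, are assumptions made for illustration only; they are not drawn from the thesis. Exact rational arithmetic is used so that no numerical tolerance is needed.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = Fraction(0)
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def is_positive_semidefinite(m):
    """A real symmetric matrix is PSD iff every principal minor is non-negative."""
    n = len(m)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[m[i][j] for j in idx] for i in idx]
            if det(sub) < 0:
                return False
    return True


def partial_transpose_b(rho):
    """Transpose the second qubit only: swap j and l in rho[(i,j),(k,l)]."""
    out = [[Fraction(0)] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    out[2 * i + j][2 * k + l] = rho[2 * i + l][2 * k + j]
    return out


def werner_state(p):
    """Hypothetical test case: p * (singlet projector) + (1 - p) * identity / 4."""
    half = Fraction(1, 2)
    singlet = [
        [0, 0, 0, 0],
        [0, half, -half, 0],
        [0, -half, half, 0],
        [0, 0, 0, 0],
    ]
    noise = (1 - p) / 4
    return [
        [p * singlet[a][b] + (noise if a == b else 0) for b in range(4)]
        for a in range(4)
    ]


def is_ppt(rho):
    """True when the partial transpose of rho is still a positive operator."""
    return is_positive_semidefinite(partial_transpose_b(rho))
```

For this family the partial transpose acquires the eigenvalue (1 - 3p)/4, so is_ppt(werner_state(Fraction(1, 2))) is False (the state is entangled), while is_ppt(werner_state(Fraction(1, 4))) is True. Since the systems are 2 ⊗ 2, the condition is both necessary and sufficient here.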

Chapter 6 is a discussion of some of the philosophical questions raised by the theory of
quantum computation. First I consider whether the possibility of exponential speed-up
in quantum computation provides an argument for a more substantive notion of quantum
information than I have previously allowed, concluding in the negative, before moving
on to consider some questions regarding the status of the Church-Turing hypothesis in
the light of quantum computation. In particular, I argue against Deutsch's claim that
a physical principle, the Turing Principle, underlies the Church-Turing hypothesis; and
consider briefly the question of whether the Church-Turing hypothesis might serve as a
constraint on the laws of physics. 

Chapter 7 brings together some morals from Part I.

Part II begins with Chapter 8, wherein I outline some preliminary considerations
that are pertinent when assessing approaches to the foundational questions in quantum 
mechanics that appeal to information. One point noted is that if all that appeal to 
information were to signify in a given approach is the advocacy of an instrumentalist 
view, then we are not left with a very interesting, or at least, not a very distinctive, 
position. 

The most prominent lines of research engaged in bringing out implications of quan- 
tum information theory for the foundations of quantum mechanics have been concerned 
with establishing whether information-theoretic ideas might finally provide a perspicu- 
ous conceptual basis for quantum mechanics, perhaps by suggesting an axiomatisation 
of the theory that lays our interminable worrying to rest. That one might hope to make
progress in this direction is a thought that has been advocated persuasively by
Fuchs (2003), for example. In the final chapter, I investigate some proposals in this vein,
in particular, Zeilinger's Foundational Principle and the information-theoretic
characterization theorem of Clifton, Bub and Halvorson (Clifton et al., 2003). I show that
Zeilinger's Foundational Principle ('An elementary system represents the truth value of
one proposition') does not in fact provide a foundational principle for quantum mechanics
and fails to underwrite explanations of the irreducible randomness of quantum measure- 
ment and the existence of entanglement, as Zeilinger had hoped. The assessment of the 
theorem of Clifton, Bub and Halvorson is more positive: here indeed an axiomatisation 
of quantum mechanics has been achieved. However, I raise some questions concern- 
ing the C*-algebraic starting point of the theorem and argue that it remains obscure 
what implications for the standard interpretational questions of quantum mechanics this 
axiomatisation might have. 



Part I 

What is Information? 



To suppose that, whenever we use a singular substantive, we are, or ought to
be, using it to refer to something, is an ancient, but no longer a respectable,
error.

Strawson (1950)



Chapter 1 

Concepts of Information 



1.1 How to talk about information: Some simple ways 

The epigraph to this Part is drawn from Strawson's contribution to his famous 1950 sym- 
posium with Austin on truth. Austin's point of departure in that symposium provides 
also a suitable point of departure for us, concerned as we are with information. 

Austin's aim was to de-mystify the concept of truth, and make it amenable to dis- 
cussion, by pointing to the fact that 'truth' is an abstract noun. So too is 'information'. 
This fact will be of recurrent interest in the first part of this thesis. 

" 'What is truth?' said jesting Pilate, and would not stay for an answer." Said 
Austin: "Pilate was in advance of his time." 

As with truth, so with^ information: 

For 'truth' ['information'] itself is an abstract noun, a camel, that is of a 
logical construction, which cannot get past the eye even of a grammarian. 

We approach it cap and categories in hand: we ask ourselves whether Truth 
[Information] is a substance (the Truth [the information], the Body of 
Knowledge), or a quality (something like the colour red, inhering in truths 
[in messages]), or a relation ('correspondence' ['correlation']). 

But philosophers should take something more nearly their own size to strain 
at. What needs discussing rather is the use, or certain uses, of the word
'true' ['inform']. (Austin, 1950, p. 149)

A characteristic feature of abstract nouns is that they do not serve to denote kinds 
of entities having a location in space and time. An abstract noun may be either a count 

^Due apologies to Austin. 



noun (a noun which may combine with the indefinite article and form a plural) or a mass 
noun (one which may not). 'Information' is an abstract mass noun, so may usefully be 
contrasted with a concrete mass noun such as 'water'; and with an abstract count noun 
such as 'number'^. Very often, abstract nouns arise as nominalizations of various adjecti- 
val or verbal forms, for reasons of grammatical convenience. Accordingly, their function 
may be explained in terms of the conceptually simpler adjectives or verbs from which 
they derive; thus Austin leads us from the substantive 'truth' to the adjective 'true'. 
Similarly, 'information' is to be explained in terms of the verb 'inform'. Information, we 
might say, is what is provided when somebody is informed of something. If this is to 
be a useful pronouncement, we should be able to explain what it is to inform somebody 
without appeal to phrases like 'to convey information', but this is easily done. To inform 
someone is to bring them to know something (that they did not already know). 

Now, I shall not be seeking to present a comprehensive overview of the different uses 
of the terms 'information' or 'inform', nor to exhibit the feel for philosophically charged 
nuance of an Austin. It will suffice for our purposes merely to focus on some of the 
broadest features of the concept, or rather, concepts, of information. 

The first and most important of these features to note is the distinction between
the everyday concept of information and technical notions of information, such as that
deriving from the work of Shannon (1948). The everyday concept of information is
closely associated with the concepts of knowledge, language and meaning; and it seems,
furthermore, to be reliant in its central application on the prior concept of a person
(or, more broadly, language user) who might, for example, read and understand the
information; who might use it; who might encode or decode it.

By contrast, a technical notion of information is specified using a purely mathematical
and physical vocabulary and, prima facie, will have at most limited and derivative
links to semantic and epistemic concepts^.

^An illuminating discussion of mass, count and abstract nouns may be found in Rundle (1979,
§§27-29).

^For discussion of Dretske's opposing view, however, see below, Section 1.5.

A technical notion of information might be concerned with describing correlations
and the statistical features of signals, as in communication theory with the Shannon
concept, or it might be concerned with statistical inference (e.g. Fisher, 1925;
Kullback and Leibler, 1951; Savage, 1954; Kullback). Again, a technical notion of
information might be introduced to capture certain abstract notions of structure, such
as complexity (algorithmic information, Kolmogorov; Solomonoff, 1964) or functional
role (as in biological information perhaps, cf. Jablonka, for example^).

In this thesis our concern is information theory, quantum and classical, so we shall 
concentrate on the best known technical concept of information, the Shannon informa- 
tion, along with some closely related concepts from classical and quantum information 
theory. The technical concepts of these other flavours I mention merely to set to one 
side^. 

With information in the everyday sense, a characteristic use of the term is in phrases 
of the form: 'information about p', where p might be some object, event, or topic; or
in phrases of the form: 'information that q'. Such phrases display what is often called
intentionality. They are directed towards, or are about something (which something 
may, or may not, be present). The feature of intentionality is notoriously resistant to 
subsumption into the bare physical order. 

As I have said, information in the everyday sense is intimately linked to the concept 
of knowledge. Concerning information we can distinguish between possessing informa- 
tion, which is to have knowledge; acquiring information, which is to gain knowledge; and 
containing information, which is sometimes the same as containing knowledge^. Acquir- 
ing information is coming to possess it; and as well as being acquired by asking, reading 
or overhearing, for example, we may acquire information via perception. If something is 
said to contain information then this is because it provides, or may be used to provide, 
knowledge. As we shall presently see, there are at least two importantly distinct ways 

*N.B. To my mind, however, Jablonka overstates the analogies between the technical notion she 
introduces and the everyday concept. 

^Although it will be no surprise that one will often find the same sorts of ideas and mathematical 
expressions cropping up in the context of communication theory as in statistical inference, for exam- 
ple. There are also links between algorithmic information and the Shannon information: the average 
algorithmic entropy of a thermodynamic ensemble has the same value as the Shannon entropy of the
ensemble (Bennett, 1982).

^Containing information and containing knowledge are not always the same: we might, for example
say that a train timetable contains information, but not knowledge. 






in which this may be so. 

It is primarily a person of whom it can be said that they possess information, whilst it
is objects like books, filing cabinets and computers that contain information (cf.
Hacker, 1987). In the sense in which my books contain information and knowledge, I do not.
To contain information in this sense is to be used to store information, expressed in the 
form of propositions^, or in the case of computers, encoded in such a way that the facts, 
figures and so on may be decoded and read as desired. 

On a plausible account of the nature of knowledge originating with Wittgenstein and
Ryle (1949), and developed, for example, by Kenny (1989) and Hyman (1999), to have
knowledge is to possess a certain
capacity or ability, rather than to be in some state. On this view, the difference between 
possessing information and containing information can be further elaborated in terms 
of a category distinction: to possess information is to have a certain ability, while for 
something to contain information is for it to be in a certain state (to possess certain 
occurrent categorical properties). We shall not, however, pursue this interesting line of
analysis further here (see Kenny (1989, p.108) and Timpson (2000)).

In general, the grounds on which we would say that something contains information, 
and the senses in which it may be said that information is contained, are rather various. 
One important distinction that must be drawn is between containing information propo- 
sitionally and containing information inferentially. If something contains information 
propositionally, then it does so in virtue of a close tie to the expression of propositions. 
For example, the propositions may be written down, as in books, or on the papers in 
the filing cabinet. Or the propositions might be otherwise recorded; perhaps encoded, 
on computers, or on removable disks. The objects said to contain the information in 
these examples are the books, the filing cabinet, the computers, the disks. 

That these objects can be said to contain information about things derives from
the fact that the sentences and symbols inscribed or encoded possess meaning and
hence themselves can be about, or directed towards something. Sentences and symbols, 
in turn, possess meaning in virtue of their role within a framework of language and 

^Or perhaps expressed pictorially, also.

language users. 

If an object A contains information about B^ in the second sense, however, that
is, inferentially, then A contains information about B because there exist correlations
between them that would allow inferences about B from knowledge of A. (A prime
example would be the thickness of the rings in a tree trunk providing information about
the severity of past winters.) Here it is the possibility of our use of A, as part of an
inference providing knowledge, that provides the notion of information about^. And
note that the concept of knowledge is functioning prior to the concept of containing 
information: as I have said, the concept of information is to be explained in terms of 
the provision of knowledge. 

It is with the notion of containing information, perhaps, that the closest links between 
the everyday notion of information and ideas from communication theory are to be found. 
The technical concepts introduced by Shannon may be very helpful in describing and 
quantifying any correlations that exist between A and B. But note that describing 
and quantifying correlations does not provide us with a concept of why A may contain 
information (inferentially) about B, in the everyday sense. Information theory can 
describe the facts about the existence and the type of correlations; but to explain why 
A contains information inferentially about B (if it does), we need to refer to facts at 
a different level of description, one that involves the concept of knowledge. A further 
statement is required, to the effect that: 'Because of these correlations, we can learn 
something about B'. Faced with a bare statement: 'Such and such correlations exist', 
we do not have an explanation of why there is any link to information. It is because 
correlations may sometimes be used as part of an inference providing knowledge, that 
we may begin to talk about containing information. 
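
The point that Shannon's concepts can quantify correlations between A and B, while saying nothing about knowledge, can be made concrete with a minimal sketch. The code computes the Shannon mutual information of a joint distribution; the tree-ring probabilities and the function name are hypothetical toy choices introduced here for illustration, not figures from the text.

```python
from math import log2


def mutual_information(joint):
    """Shannon mutual information I(A:B) in bits.

    `joint` maps pairs (a, b) to probabilities that sum to 1.
    """
    p_a, p_b = {}, {}
    for (a, b), p in joint.items():
        # Marginal distributions for A and B.
        p_a[a] = p_a.get(a, 0.0) + p
        p_b[b] = p_b.get(b, 0.0) + p
    return sum(
        p * log2(p / (p_a[a] * p_b[b]))
        for (a, b), p in joint.items()
        if p > 0
    )


# Toy joint distributions over (ring thickness, winter severity).
correlated = {("thick", "mild"): 0.5, ("thin", "severe"): 0.5}
independent = {
    (ring, winter): 0.25
    for ring in ("thick", "thin")
    for winter in ("mild", "severe")
}

print(mutual_information(correlated))
print(mutual_information(independent))
```

The first distribution yields one bit and the second zero. The figure describes the existence and strength of a correlation; that someone could use the rings to learn about past winters is the further statement, at a different level of description, on which the everyday notion of containing information depends.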

While I have distinguished possessing information (having knowledge) from contain- 
ing information, there does exist a very strong temptation to try to explain the former 
in terms of the latter. However, caution is required here. We have many metaphors 
that suggest us filing away facts and information in our heads, brains and minds; but 
these are metaphors. If we think the possession of information is to be explained by our 

^Which might be another object, or perhaps an event, or state of affairs. 

^Such inferences may become habitual and in that sense, automatic and un-reflected upon.

containing information, then this cannot be 'containing' in the straightforward sense in
which books and filing cabinets contain information (propositionally), for our brains and
minds do not contain statements written down, nor even encoded. As we have noted, 
books, computers, and so on contain information about various topics because they are 
used by humans (language users) to store information. As Hacker remarks: 

...we do not use brains as we use computers. Indeed it makes no more sense 
to talk of storing information in the brain than it does to talk of having 
dictionaries or filing cards in the brain as opposed to having them in a 
bookcase or filing cabinet. (Hacker, 1987, p. 493)

We do not stand to our brains as an external agent to an object of which we may make 
use to record or encode propositions, or on which to inscribe sentences. 

A particular danger that one faces if tempted to explain possessing information in
terms of containing it, is of falling prey to the homunculus fallacy (cf. Kenny, 1971).

The homunculus fallacy is to take predicates whose normal application is to complete 
human beings (or animals) and apply them to parts of animals, typically to brains, or 
indeed to any insufficiently human-like object. The fallacy properly so-called is attempt- 
ing to argue from the fact that a person-predicate applies to a person to the conclusion 
that it applies to his brain or vice versa. This form of argument is non-truth-preserving 
as it ignores the fact that the term in question must have a different meaning if it is to 
be applied in these different contexts. 

'Homunculus' means 'miniature man', from the Latin (the diminutive of homo). This 
is an appropriate name for the fallacy, for in its most transparent form it is tantamount 
to saying that there is a little man in our heads who sees, hears, thinks and so on. 
Because if, for example, we were to try to explain the fact that a person sees by saying 
that images are produced in his mind, brain or soul (or whatever) then we would not 
have offered any explanation, but merely postulated a little man who perceives the 
images. For exactly the same questions arise about what it is for the mind/brain/soul 
to perceive these images as we were trying to answer for the whole human being. This is 
a direct consequence of the fact that we are applying a predicate — 'sees' — that applies 
properly only to the whole human being to something which is merely a part of a human 
being, and what is lacking is an explanation of what the term means in this application. 
It becomes very clear that the purported explanation of seeing in terms of images in the 
head is no explanation at all, when we reflect that it gives rise to an infinite regress. If 
we see in virtue of a little man perceiving images in our heads, then we need to explain 
what it is for him to perceive, which can only be in terms of another little man, and so 
on. 

The same would go, mutatis mutandis, for an attempt to explain possession of in- 
formation in terms of containing information propositionally. Somebody is required to 
read, store, decode and encode the various propositions, and peruse any pictures; and 
this leads to the regress of an army of little men. Again, the very same difficulty would 
arise for attempts to describe possessing information as containing information inferen- 
tially: now the miniature army is required to draw the inferences that allow knowledge 
to be gained from the presence of correlations. 

This last point indicates that a degree of circumspection is required when dealing
with the common tendency to describe the mechanisms of sensory perception in terms
of information reaching the brain. In illustration (cf. Hacker 1987), it has been known
since the work of Hubel and Wiesel (see for example Hubel and Wiesel (1979)) that there
exist systematic correlations between the responses of groups of cells in the visual striate 
cortex and certain specific goings-on in a subject's visual field. It seems very natural to 
describe the passage of nerve impulses resulting from retinal stimuli to particular regions 
of the visual cortex as visual information reaching the brain. This is unobjectionable, 
so long as it is recognised that this is not a passage of information in the sense in which 
information has a direct conceptual link to the acquisition of knowledge. In particular, 
the visual information is not information for the subject about the things they
have seen. The sense in which the brain contains visual information is rather the sense 
in which a tree contains information about past winters. 

Equipped with suitable apparatus, and because he knows about a correlation that 
exists, the neurophysiologist may make, from the response of certain cells in the visual 
cortex, an inference about what has happened in the subject's visual field. But the 
brain is in no position to make such an inference, nor, of course, an inference of any
kind. Containing visual information, then, is containing information inferentially, and
trying to explain a person's possession of information about things seen as their brain
containing visual information would lead to a homunculus regress: who is to make the
inference that provides knowledge?

This is not to deny the central importance and great interest of the scientific results 
describing the mechanisms of visual perception for our understanding of how a person can 
gain knowledge of the world surrounding them, but is to guard against an equivocation. 
The answers provided by brain science are to questions of the form: what are the causal 
mechanisms which underlie our ability to gain visual knowledge? This is misdescribed as 
a question of how information flows, if it is thought that the information in question is 
the information that the subject comes to possess. One might have 'information flow' in 
mind, though, merely as a picturesque way of describing the processes of electrochemical 
activity involved in perception, in analogy to the processes involved in the transmission 
of information by telephone and the like. This use is clearly unproblematic, so long as
one is aware of the limits of the analogy. (We don't want the question to be suggested: 
so who answers the telephone? This would take us back to our homunculi.) 

1.2 The Shannon Information and related concepts 

The technical concept of information relevant to our discussion, the Shannon informa- 
tion, finds its home in the context of communication theory. We are concerned with a 
notion of quantity of information; and the notion of quantity of information is cashed out 
in terms of the resources required to transmit messages (which is, note, a very limited 
sense of quantity). I shall begin by highlighting two main ways in which the Shannon 
information may be understood, the first of which rests explicitly on Shannon's 1948 
noiseless coding theorem. 

1.2.1 Interpretation of the Shannon Information 

It is instructive to begin by quoting Shannon:

The fundamental problem of communication is that of reproducing at one
point either exactly or approximately a message selected at another point.
Frequently these messages have meaning... These semantic aspects of
communication are irrelevant to the engineering problem. (Shannon 1948, p. 31)

The communication system consists of an information source, a transmitter or encoder, 
a (possibly noisy) channel, and a receiver (decoder). It must be able to deal with any 
possible message produced (a string of symbols selected in the source, or some varying 
waveform), hence it is quite irrelevant whether what is actually transmitted has any 
meaning or not, or whether what is selected at the source might convey anything to 
anybody at the receiving end. It might be added that Shannon arguably understates his 
case: in the majority of applications of communication theory, perhaps, the messages 
in question will not have meaning. For example, in the simple case of a telephone 
line, what is transmitted is not what is said into the telephone, but an analogue signal 
which records the sound waves made by the speaker, this analogue signal then being 
transmitted digitally following an encoding. 

It is crucial to realise that 'information' in Shannon's theory is not associated with 
individual messages, but rather characterises the source of the messages. The point of 
characterising the source is to discover what capacity is required in a communications 
channel to transmit all the messages the source produces; and it is for this that the 
concept of the Shannon information is introduced. The idea is that the statistical nature 
of a source can be used to reduce the capacity of channel required to transmit the 
messages it produces (we shall restrict ourselves to the case of discrete messages for 
simplicity).

Consider an ensemble $X$ of letters $\{x_1, x_2, \ldots, x_n\}$ occurring with probabilities $p(x_i)$.
This ensemble is our source^10, from which messages of $N$ letters are drawn. We are
concerned with messages of very large $N$. For such messages, we know that typical
sequences of letters will contain $Np(x_i)$ of letter $x_i$, $Np(x_j)$ of $x_j$ and so on. The
number of distinct typical sequences of letters is then given by

\[ \frac{N!}{Np(x_1)!\, Np(x_2)! \cdots Np(x_n)!} \]

and using Stirling's approximation, this becomes $2^{NH(X)}$, where

\[ H(X) = -\sum_{i=1}^{n} p(x_i) \log p(x_i), \tag{1.1} \]

is the Shannon information (logarithms are to base 2 to fix the units of information as
binary bits).

^10 More properly, this ensemble models the source.

Now as $N \to \infty$, the probability of an atypical sequence appearing becomes negligible
and we are left with only $2^{NH(X)}$ equiprobable typical sequences which need ever be
considered as possible messages. We can thus replace each typical sequence with a
binary code number of $NH(X)$ bits and send that to the receiver rather than the original
message of $N$ letters ($N \log n$ bits).

The message has been compressed from $N$ letters to $NH(X)$ bits ($\leq N \log n$ bits).
Shannon's noiseless coding theorem, of which this is a rough sketch, states that this rep- 
resents the optimal compression (Shannon 1948). The Shannon information is, then, ap- 
propriately called a measure of information because it represents the maximum amount 
that messages consisting of letters drawn from an ensemble X can be compressed. 
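The counting step in this sketch of the theorem can be checked numerically. The following is a minimal Python illustration, not part of Shannon's argument: the four-letter distribution and the function names `shannon_information` and `log2_typical_count` are assumptions introduced here. It compares the logarithm of the multinomial count of typical sequences, taken per letter, with the value of H(X) from equation (1.1).

```python
import math


def shannon_information(probs):
    """Shannon information H(X) in bits; zero-probability letters contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def log2_typical_count(probs, n_letters):
    """log2 of N! / (Np(x_1)! ... Np(x_n)!), using lgamma to avoid huge factorials."""
    log_count = math.lgamma(n_letters + 1)
    for p in probs:
        log_count -= math.lgamma(n_letters * p + 1)
    return log_count / math.log(2)


# Hypothetical source: four letters with these probabilities (an assumption).
probs = [0.5, 0.25, 0.125, 0.125]
h = shannon_information(probs)  # 1.75 bits, against log2(4) = 2 bits uncompressed

for n_letters in (80, 8000, 800000):
    bits_per_letter = log2_typical_count(probs, n_letters) / n_letters
    print(n_letters, round(bits_per_letter, 4), h)
```

On these assumptions the per-letter figure stays below H(X) and approaches it as N grows, which is what the Stirling approximation asserts.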

One may also make the derivative statement that the information per letter in a
message is $H(X)$ bits, which is equal to the information of the source. But 'derivative'
is an important qualification: we can only consider a letter $x_i$ drawn from an ensemble
$X$ to have associated with it the information $H(X)$ if we consider it to be a member of
a typical sequence of $N$ letters, where $N$ is large, drawn from the source.

Note also that we must strenuously resist any temptation to conclude that because 
the Shannon information tells us the maximum amount a message drawn from an en- 
semble can be compressed, that it therefore tells us the irreducible meaning content of 
the message, specified in bits, which somehow possess their own intrinsic meaning. This 
idea rests on a failure to distinguish between a code, which has no concern with meaning,
and a language, which does (cf. Harris (1987)).

Information and Uncertainty

Another way of thinking about the Shannon information is as a measure of the amount 
of information that we expect to gain on performing a probabilistic experiment. The 
Shannon measure is a measure of the uncertainty of a probability distribution as well as 
serving as a measure of information. A measure of uncertainty is a quantitative measure 
of the lack of concentration of a probability distribution; this is called an uncertainty
because it measures our uncertainty about what the outcome of an experiment completely
described by the probability distribution in question will be. Uffink (1990) provides an
axiomatic characterisation of measures of uncertainty, deriving a general class of
measures, $U_r(\vec{p})$, of which the Shannon information is one (see also Maassen and Uffink
1989). The key property possessed by these measures is Schur concavity (for details
of the property of Schur concavity, see Uffink (1990), Nielsen (2001) and Section 2.3.1
below).

Imagine a random probabilistic experiment described by a probability distribution
$\vec{p} = \{p(x_1), \ldots, p(x_n)\}$. The intuitive link between uncertainty and information is that
the greater the uncertainty of this distribution, the more we stand to gain from learning 
the outcome of the experiment. In the case of the Shannon information, this notion of 
how much we gain can be made more precise. 

Some care is required when we ask 'how much do we know about the outcome?' for 
a probabilistic experiment. In a certain sense, the shape of the probability distribution 
might provide no information about what an individual outcome will actually be, as 
any of the outcomes assigned non-zero probability can occur. However, we can use the 
probability distribution to put a value on any given outcome. If it is a likely one, then 
it will be no surprise if it occurs, so of little value; if an unlikely one, it is a surprise, 
hence of higher value. A nice measure for the value of the occurrence of outcome $x_i$ is
$-\log p(x_i)$, a decreasing function of the probability of the outcome. We may call this
the 'surprise' information associated with outcome $x_i$: it measures the value of having
observed this outcome of the experiment (as opposed to: not bothering to observe it at
all) given that we know the probability distribution for the outcomes^11.

^11 Of course, this is a highly restricted sense of 'value'. It does not, for example, refer to how much
might be implied by this particular outcome having occurred, nor to the value of what might be learnt
from it, nor the value of what it conveys (if anything); these ideas all lie on the 'everyday concept of
information' side that is not being addressed here. The distinction between the surprise information
and the everyday concept becomes very clear when one reflects that what one learns from a particular
outcome may well be, in fact generally will be, quite independent of the probability assigned to it.

If the information (in this restricted sense) that we would gain if outcome $x_i$ were to
occur is $-\log p(x_i)$, then before the experiment, the amount of information we expect to
gain is given by the expectation value of the 'surprise' information, $\sum_i p(x_i)(-\log p(x_i))$;
and this, of course, is just the Shannon information $H$ of the probability distribution $\vec{p}$.
Hence the Shannon information tells us our expected information gain.

More generally, though, any of the measures of uncertainty $U_r(\vec{p})$ may be understood
as measures of information gain; and a similar story can be told for measures of 'how
much we know' given a probability distribution. These will be the inverses of an
uncertainty: we want a measure of the concentration of a probability distribution; the more
concentrated, the more we know about what the outcome will be; which just means, the
better we can predict the outcome. (To say in this way that we have a certain amount of
information (knowledge) about what the outcome of an experiment will be, therefore,
is not to claim that we have partial knowledge of some predetermined fact about the
outcome of an experiment.)

The minimum number of questions needed to specify a sequence

The final common interpretation of the Shannon information is as the minimum average
number of binary questions needed to specify a sequence drawn from an ensemble (Uffink
1990; Ash 1965), although this appears not to provide an interpretation of the Shannon
information actually independent of the previous two.

Imagine that a long sequence of $N$ letters is drawn from the ensemble $X$, or that
$N$ independent experiments whose possible outcomes have probabilities $p(x_i)$ are
performed, but the list of outcomes is kept from us. Our task is to determine what the
sequence is by asking questions to which the guardian of the sequence can only answer
'yes' or 'no'; and we choose to do so in such a manner as to minimize the average number
of questions needed. We need to be concerned with the average number to rule out lucky
guesses identifying the sequence.

If we are trying to minimize the average number of questions, it is evident that the
best questioning strategy will be one that attempts to rule out half the possibilities with
each question, for then whatever the answer turns out to be, we still get the maximum
value from each question. Given the probability distribution, we may attempt to
implement this strategy by dividing the possible outcomes of each individual experiment
into classes of equal probability, and then asking whether or not the outcome lies in
one of these classes. We then try and repeat this process, dividing the remaining set of
possible outcomes into two sets of equal probabilities, and so on. It is in general not
possible to proceed in this manner, dividing a finite set of possible outcomes into two
sets of equal probabilities, and it can be shown that in consequence the average number
of questions required if we ask about each individual experiment in isolation is greater
than or equal to $H(X)$. However, if we consider the $N$ repeated experiments, where $N$
tends to infinity, and consider asking joint questions about what the outcomes of the
independent experiments were, we can always divide the classes of possibilities of (joint)
outcomes in the required way. Now we already know that for large $N$, there are $2^{NH(X)}$
typical sequences, so given that we can strike out half the possible sequences with each
question, the minimum average number of questions needed to identify the sequence is
$NH(X)$. (These last results are again essentially the noiseless coding theorem.)
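The behaviour of the questioning strategy can be illustrated with a short Python sketch. It rests on assumptions that are not in the text: Huffman's construction is used as a standard way of finding an optimal tree of yes/no questions, the two-outcome distribution is hypothetical, and the names `average_questions` and `block_probs` are introduced only for this illustration. It compares the average number of questions per experiment with H(X) when questions are asked about single experiments and about blocks of several experiments jointly.

```python
import heapq
import itertools
import math


def shannon_information(probs):
    """Shannon information H(X) in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def average_questions(probs):
    """Expected number of yes/no questions for an optimal (Huffman) question tree."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged  # every outcome under this merge costs one more question
        heapq.heappush(heap, merged)
    return total


def block_probs(probs, k):
    """Probabilities of the joint outcomes of k independent experiments."""
    return [math.prod(block) for block in itertools.product(probs, repeat=k)]


probs = [0.9, 0.1]  # hypothetical two-outcome experiment (an assumption)
h = shannon_information(probs)
for k in (1, 2, 4, 8):
    per_experiment = average_questions(block_probs(probs, k)) / k
    print(k, round(per_experiment, 4), round(h, 4))
```

For a single experiment one full question is needed, well above H(X); asking joint questions about longer blocks brings the average per experiment down towards H(X), never below it.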

It is not immediately obvious, however, why the minimum average number of questions
needed to specify a sequence should be related to a notion of information. (Again,
the tendency to think of bits and binary questions as irreducible meaning elements is to
be resisted.) It seems, in fact, that this is either just another way of talking about the
maximum amount that messages drawn from a given ensemble can be compressed, in
which case we are back to the interpretation of the Shannon information in terms of the
noiseless coding theorem, or it is providing a particular way of characterising how much
we stand to gain from learning a typical sequence, and we return to an interpretation
in terms of our expected information gain.

1.2.2 More on communication channels

So far we have concentrated on only one aspect of describing a communication sys- 
tem, namely, on characterising the information source. The other important task is to 
characterise the communication channel. 

A channel is defined as a device with a set $\{x_i\}$ of input states, which are mapped
to a set $\{y_j\}$ of output states. If a channel is noisy then this mapping will not be
one-to-one. A given input could give rise to a variety of output states, as a result of noise.
The basic type of channel, the discrete memoryless channel, is characterised in terms
of the conditional probabilities $p(y_j|x_i)$: given that input $x_i$ is prepared, what is the
probability that output $y_j$ will be produced?

If the distribution, $p(x_i)$, for the probability with which the various inputs will be
prepared is also specified, then we may calculate the joint distribution $p(x_i \wedge y_j)$. We
may consider which input state is prepared on a given use of the channel to be a random
variable $X$, with $p(X = x_i) = p(x_i)$; which output produced to be a random variable $Y$,
$p(Y = y_j) = p(y_j)$; and we may consider also the joint random variable $X \wedge Y$, where
$p(X \wedge Y = x_i \wedge y_j) = p(x_i \wedge y_j)$.

The joint distribution $p(x_i \wedge y_j)$ allows us to define the joint uncertainty

\[ H(X \wedge Y) = -\sum_{i,j} p(x_i \wedge y_j) \log p(x_i \wedge y_j), \tag{1.2} \]

and an important quantity known as the 'conditional entropy':

\[ H(X|Y) = \sum_{j} p(y_j) \Bigl( -\sum_{i} p(x_i|y_j) \log p(x_i|y_j) \Bigr). \tag{1.3} \]

The scare quotes are significant, as this quantity is not actually an entropy or uncertainty 
itself, but is rather the average of the uncertainties of the conditional distributions for the 
input, given a particular Y output. It measures the average of how uncertain someone 
will be about the X value when they have observed an output Y value. 

As Uffink (1990, §1.6.6) notes, it pays to attend to the fact that $H(X|Y)$ is not a
measure of uncertainty. It is easy to show (e.g. Ash 1965, Thm. 1.4.3-5) that

\[ H(X|Y) \leq H(X), \text{ with equality iff $X$ and $Y$ are independent;} \tag{1.4} \]

and it is often held that this is a particularly appealing feature of the Shannon measure
of information because it captures the intuitive idea that by learning the value of $Y$, we
gain some information about $X$, therefore our uncertainty in the value of $X$ should go
down (unless the two are independent). Thus, Shannon describes the inequality (1.4) as
follows:

The uncertainty of $X$ is never increased by knowledge of $Y$. It will be
decreased unless $Y$ and $X$ are independent events, in which case it is not
changed. (Shannon 1948, p. 53)

But this description is highly misleading. As Uffink remarks, one's uncertainty cer- 
tainly can increase following an observation: increasing knowledge need not lead to a 
decrease in uncertainty. This is well illustrated by Uffink's 'keys' example: my keys are 
in my pocket with a high probability, if not, they could be in a hundred places all with 
equal (low) probability. This distribution is highly concentrated so my uncertainty is 
low. If I look, however, and find that my keys are not in my pocket, then my uncertainty 
as to their whereabouts increases enormously. An increase in knowledge has led to an 
increase in uncertainty. 
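The keys example can be made concrete with a few lines of Python. The numbers are assumptions: the text says only that the pocket has 'a high probability' and that there are a hundred other places, so 0.9 is chosen here purely for illustration. The sketch computes the uncertainty before looking, the uncertainty after finding the pocket empty, and the average over both possible results of looking, which is the quantity that appears in the inequality (1.4).

```python
import math


def entropy(probs):
    """Shannon uncertainty of a distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Assumed numbers: the text says only 'high probability' and 'a hundred places'.
p_pocket = 0.9
n_other = 100
prior = [p_pocket] + [(1 - p_pocket) / n_other] * n_other

before = entropy(prior)                                 # about 1.13 bits
after_not_in_pocket = entropy([1 / n_other] * n_other)  # log2(100), about 6.64 bits
after_in_pocket = entropy([1.0])                        # no uncertainty remains

# Conditional 'entropy': the average over the two possible results of looking.
average_after = p_pocket * after_in_pocket + (1 - p_pocket) * after_not_in_pocket

print(round(before, 3), round(after_not_in_pocket, 3), round(average_after, 3))
```

On these assumed numbers the uncertainty after one particular observation is much larger than before, while the average over observations is smaller, so the single-case increase and the inequality for the average sit side by side.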

This does not conflict with the inequality (1.4), of course, as the latter involves
an average over post-observation uncertainties. Uffink remarks, against Jaynes (1957,
p. 186) for example, that

...there is no paradox in an increase of uncertainty about the outcome of an
experiment as a result of information about its distribution. The confusion
is caused by a liberal use of the multifaceted term information, and also by
the deceptive name of conditional entropy for what is actually an average of
the entropies of conditional distributions. (Uffink 1990, p. 83)

To see why the conditional entropy is important, consider a very large number $N$ of
repeated uses of our channel. There are $2^{NH(X)}$ typical $X$ (input) sequences that could
arise, $2^{NH(Y)}$ typical output sequences that could be produced, and $2^{NH(X \wedge Y)}$ typical
sequences of pairs of $X$, $Y$ values that could obtain. Suppose someone observes which
$Y$ sequence has actually been produced. If the channel is noisy, then there is more than
one input $X$ sequence that could have given rise to it. The conditional entropy measures
the number of possible input sequences that could have given rise to the observed output
(with non-vanishing probability).

If there are $2^{NH(X \wedge Y)}$ typical sequences of pairs of $X$, $Y$ values, then the number of
typical $X$ sequences that could result in the production of a given $Y$ sequence will be
given by

\[ \frac{2^{NH(X \wedge Y)}}{2^{NH(Y)}} = 2^{N(H(X \wedge Y) - H(Y))}. \]

Due to the logarithmic form of $H$, $H(X \wedge Y) = H(Y) + H(X|Y)$, and it follows that the
number of input sequences consistent with a given output sequence will be $2^{NH(X|Y)}$.



Shannon (1948, §12) points out that this means that if one is trying to use a noisy
channel to send a message, then the conditional entropy specifies the number of bits per
letter that would need to be sent by an auxiliary noiseless channel in order to correct
all the errors that have crept into the transmitted sequence, as a result of the noise. If
input and output states are perfectly correlated, i.e., there is no noise, then obviously
$H(X|Y) = 0$.

Another most important quantity is the mutual information, $H(X:Y)$, defined as

\[ H(X:Y) = H(X) - H(X|Y). \tag{1.5} \]

It follows from Shannon's noisy coding theorem (1948) that the mutual information
$H(X:Y)$ governs the rate at which information may be sent over a channel with input
distribution $p(x_i)$, with vanishingly small probability of error.

The following sorts of heuristic interpretations of the mutual information may also
be given: With a noiseless channel, an output $Y$ sequence would contain as much
information as the input $X$ sequence, i.e., $NH(X)$ bits. If there is noise, it will contain
less. We know, however, that $H(X|Y)$ measures the number of bits per letter needed
to correct an observed $Y$ sequence, therefore the amount of information this sequence
actually contains will be $NH(X) - NH(X|Y) = NH(X:Y)$ bits.

Or again, we can say that $NH(X:Y)$ provides a measure of the amount that we
are able to learn about the identity of an input $X$ sequence from observing the output $Y$
sequence: There are $2^{NH(X|Y)}$ input sequences that will be compatible with an observed
output sequence, and the size of this group, as a fraction of the total number of possible
input sequences, may be used as a measure of how much we have narrowed down the
identity of the $X$ sequence by observing the $Y$ sequence. This fractional size is

\[ \frac{2^{NH(X|Y)}}{2^{NH(X)}} = \frac{1}{2^{NH(X:Y)}}, \]

and the smaller this fraction (hence the greater $H(X:Y)$) the more one learns from
learning the Y sequence. 
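The quantities defined in (1.2), (1.3) and (1.5) can be computed directly for a small channel. The following Python sketch is an illustration under stated assumptions: the binary symmetric channel with error probability 0.1, the uniform input distribution and the function name `channel_quantities` are all introduced here and do not come from the text. It evaluates the conditional entropy from its definition as an average of entropies of conditional distributions and then checks the relation between the joint uncertainty, H(Y) and H(X|Y).

```python
import math


def entropy(probs):
    """Shannon uncertainty of a distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def channel_quantities(p_x, p_y_given_x):
    """Return H(X), H(Y), the joint uncertainty, H(X|Y) and H(X:Y), in bits."""
    n_in, n_out = len(p_x), len(p_y_given_x[0])
    joint = [[p_x[i] * p_y_given_x[i][j] for j in range(n_out)] for i in range(n_in)]
    p_y = [sum(joint[i][j] for i in range(n_in)) for j in range(n_out)]
    h_x = entropy(p_x)
    h_y = entropy(p_y)
    h_joint = entropy([p for row in joint for p in row])
    # Definition (1.3): average over outputs of the entropy of p(x|y).
    h_x_given_y = sum(
        p_y[j] * entropy([joint[i][j] / p_y[j] for i in range(n_in)])
        for j in range(n_out)
        if p_y[j] > 0
    )
    return h_x, h_y, h_joint, h_x_given_y, h_x - h_x_given_y


# Hypothetical channel: each input is flipped with probability 0.1 (an assumption).
p_x = [0.5, 0.5]
p_y_given_x = [[0.9, 0.1], [0.1, 0.9]]
h_x, h_y, h_joint, h_x_given_y, mutual = channel_quantities(p_x, p_y_given_x)
print(round(h_x, 4), round(h_y, 4), round(h_joint, 4),
      round(h_x_given_y, 4), round(mutual, 4))
```

For this assumed channel the conditional entropy is a little under half a bit per letter, so a little over half a bit per letter of the input survives the noise.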

The most important interpretation of the mutual information does derive from the
noisy coding theorem, however. Consider, as usual, sequences of length $N$, where $N$ is
large; the input distribution to our channel is $p(x_i)$. Roughly speaking, the noisy coding
theorem tells us that it is possible to find $2^{NH(X:Y)}$ $X$ sequences of length $N$ (code
words) such that on observation of the $Y$ sequence produced following preparation of
one of these code words, it is possible to infer which $X$ sequence was prepared, with a
probability of error that tends to zero as $N$ tends to infinity (Shannon 1948). So if we
were now to consider an information source $W$ producing messages with an information
of $H(W) = H(X:Y)$, each output sequence of length $N$ from this source could be
associated with an $X$ code word, and hence messages from $W$ be sent over the channel
with arbitrarily small error as $N$ is increased^12.

The capacity, $C$, of a channel is defined as the supremum over all input distributions
$p(x_i)$ of $H(X:Y)$. The noisy coding theorem states that given a channel with
capacity $C$ and an information source with an information of $H \leq C$, there exists a
coding system such that the output of the source can be transmitted over the channel
with an arbitrarily small frequency of errors.
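The definition of capacity as a supremum over input distributions can be illustrated for a simple case. The Python sketch below makes several assumptions that are not in the text: the channel is a binary symmetric channel with error probability 0.1, the supremum is approximated by scanning a grid of input distributions, and the function name `mutual_information` is chosen for the illustration. The result is compared with the standard closed form for this channel, one minus the binary entropy of the error probability.

```python
import math


def entropy(probs):
    """Shannon uncertainty of a distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(p_x, p_y_given_x):
    """H(X:Y) = H(X) - H(X|Y) for a discrete memoryless channel, in bits."""
    n_in, n_out = len(p_x), len(p_y_given_x[0])
    joint = [[p_x[i] * p_y_given_x[i][j] for j in range(n_out)] for i in range(n_in)]
    p_y = [sum(row[j] for row in joint) for j in range(n_out)]
    h_x_given_y = sum(
        p_y[j] * entropy([joint[i][j] / p_y[j] for i in range(n_in)])
        for j in range(n_out)
        if p_y[j] > 0
    )
    return entropy(p_x) - h_x_given_y


# Hypothetical channel: each input is flipped with probability 0.1 (an assumption).
flip = 0.1
channel = [[1 - flip, flip], [flip, 1 - flip]]

# Crude stand-in for the supremum: scan input distributions on a grid.
grid = [k / 1000 for k in range(1001)]
capacity, best_p = max((mutual_information([p, 1 - p], channel), p) for p in grid)

print(round(capacity, 4), best_p)
```

For this symmetric channel the maximum is reached at the uniform input distribution; a less symmetric channel would in general require a proper optimisation rather than a grid scan.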

^12 This result is particularly striking as it is not intuitively obvious that in the presence of noise,
arbitrarily good transmission may be achieved without the per letter rate of information transmission
also tending to zero. The noisy coding theorem assures us that it can be achieved.

1.2.3 Interlude: Abstract/concrete; technical, everyday

Part of my aim in this chapter has been to deflect the pressure of the question 'What 
is information?' by following the lead of Austin (and, of course, Wittgenstein^13) and
pointing to the fact that 'information' is an abstract noun: correspondingly we should 
not seek to illuminate the term by attempting fruitlessly to grasp for something that 
it corresponds or refers to, but by considering simple examples of its function and in 
particular considering its relations to grammatically simpler and less mystifying terms 
like 'inform'. 

Now, when turning to information in the technical sense of Shannon's theory, we 
explicitly do not seek to understand this noun by comparison with the verb 'inform'. 
'Information' in the technical sense is evidently not derived from a nominalization of 
this verb. Nonetheless, 'information' remains an abstract, rather than a concrete noun:
it doesn't serve to refer to a material thing or substance. In this regard, note that 
the distinction 'abstract/concrete' as applied to nouns does not map onto a distinction 
between physical concepts and concepts belonging to other categories. Thus the fact that 
'information', in the technical sense of Shannon's theory, may be included as a concept 
specifiable in physical terms does not entail that it stands for a concrete particular, entity 
or substance. For example, energy is a paradigmatic physical concept (to use another 
relevant term, energy is a physical quantity), yet 'energy' is an abstract (mass) noun 
(akin to a property name). The interesting differences that exist between energy and 
the technical notion of information as examples of physical quantities deserve further 
analysis. See Chapter 2 and Sections 3.4 and 3.6 for some remarks in this direction.

Why my insistence that 'information' in the technical sense remains an abstract 
noun? Well, consider that two strategies present themselves for providing an answer to 
the question 'What is information?' in the case of information theory. On the first the
answer is: what is quantified by the Shannon information and mutual information. On 
the second it is: what is transmitted by information sources. These different strategies
provide differing, but complementary answers. Under both, however, 'information' is an
abstract noun.

^13 'The questions "What is length?", "What is meaning?", "What is the number one?" etc., produce
in us a mental cramp. We feel that we can't point to anything in reply to them and yet ought to point
to something. (We are up against one of the great sources of philosophical bewilderment: a substantive
makes us look for a thing that corresponds to it.)' (Wittgenstein 1958, p. 1).

Taking the first strategy, one considers what is quantified by the Shannon infor- 
mation and mutual information. As we have seen, the Shannon information serves to 
quantify how much messages produced by a source can be compressed and the mutual 
information quantifies the capacity of a channel (for a particular input source distribu- 
tion) to transmit messages. But this is evidently not to quantify an amount of stuff 
(even of some very diaphanous kind); and the amount that messages can be compressed 
and the capacity of a channel are no more concrete things than the size of my shoe is a 
concrete thing. 

Now consider the second strategy. Recall our earlier quotation from Shannon. There 
he described the fundamental aim of communication theory as that of reproducing at one 
point a message that was selected at another point. Thus we might say (very roughly) 
that in the technical case, information is what it is the aim of a communication protocol
to transmit: information (in the technical sense) is what is produced by an information
source that is required to be reproduced if the transmission is to be counted a success^14.

However, the pertinent sense of 'what is produced' is not the one pointing us towards 
the concrete systems that are produced by the source on a given occasion, but rather 
the one which points us towards the particular type (sequence or structure) that these 
tokens instantiate. But a type is not a concrete thing, hence 'information', in this 
technical sense, remains an abstract noun. 

So, for example, if the source $X$ produces a string of letters like the following:

\[ x_2 x_1 x_3 x_1 x_4 \ldots x_2 x_1 x_7 x_1 x_4, \]

say, then the type is the sequence $\langle x_2 x_1 x_3 x_1 x_4 \ldots x_2 x_1 x_7 x_1 x_4 \rangle$; we might name this
'sequence 17'. The aim is to produce at the receiving end of the communication
channel another token of this type. What has been transmitted, though, the information
transmitted on this run of the protocol, is sequence 17; and this is not a concrete thing.

^14 Note that this formulation is left deliberately open. What counts as successful transmission and
therefore, indeed, as what one is trying to transmit, depends upon one's aims and interests in setting
up a communication protocol.

At this point we may draw an illustrative, albeit partial, analogy with information 
in the everyday sense. Imagine that I write down a message to a friend on a piece of 
paper (using declarative sentences, to keep things simple); one will distinguish in the
standard way between the sentence tokens inscribed and what is said by the sentences:
the propositions expressed^15. It is the latter, what is said, that is the information
(everyday sense) I wish to convey. Similarly with information in the technical sense just 
described: one should distinguish between the concrete systems that the source outputs 
and the type that this output instantiates. Again, it is the latter that is important; this 
is the information (technical sense) that one is seeking to transmit. 

An important disanalogy between the technical and everyday notions of information 
now forcibly presents itself: the restatement of a by-now familiar point. In the everyday
case, when I have written down my message to my friend, one not only has the sentence 
tokens and the sentence type they instantiate but also the propositions these sentences 
express; and again, it is these last that are the information I wish to convey. In the case 
we have just outlined for the information-theoretic notion of information, though, one 
only has the tokens produced by the source and the type they instantiate; it is this type 
that is transmitted, that constitutes the information in the technical sense we have just 
sketched. The further level, if any, of what various types might mean, or what instances 
of these types might convey, is not relevant to, or discussed by information theory: the 
point once more that information in the technical sense is not a semantic notion. Indeed, 
considered from the point of view of information theory, the output of an information 
source does not even have any syntactic structure. 



^15 Note, of course, that the propositions expressed are not to be identified with the sentence types of
which the tokens I write are particular instances. (Consider, for example, indexicals.)

1.3 Aspects of Quantum Information

Quantum information is a rich theory that seeks to describe and make use of the
distinctive possibilities for information processing and communication that quantum systems
provide. What draws the discipline together is the recognition that far from quantum
behaviour presenting a potential nuisance for computation and information transmission
(in light of the trend towards increasing miniaturisation), the fact that the properties
of quantum systems differ so markedly from those of classical objects actually provides
opportunities for interesting new communication protocols and forms of information
processing. Entanglement and non-commutativity, two essentially quantum features, can
be used.

To give some examples: Deutsch (1985) introduced the concept of the universal
quantum computer, and the evidence suggests that quantum computers are exponentially
more powerful than classical computational models for the important task of factoring
large numbers (Shor, 1994); meanwhile quantum cryptography makes use of the fact that
non-orthogonal quantum states cannot be perfectly distinguished in designing protocols
for sharing secret random keys (e.g., Bennett and Brassard, 1984), thus holding out the
promise of security of communication guaranteed by the laws of physics; entanglement
may also be used in such protocols (Ekert, 1991).



Although the field of quantum information began to emerge in the mid-1980s, the
concept of quantum information itself was not truly available until the quantum analogue
of Shannon's noiseless coding theorem was developed (Schumacher, 1995; Jozsa and
Schumacher, 1994).

Historical note: Chris Fuchs has informed me that Ben Schumacher recollects first presenting the
notion of quantum information at the IEEE meeting on the Physics of Computation in Dallas in October
1992. The germ of the idea and the term 'qubit' arose in conversation between Schumacher and Wootters
some months earlier.

Quantum information theory may be considered as an extension of classical information
theory which introduces new communication primitives, e.g., the qubit (two-state
quantum system) and shared entanglement, while providing quantum generalisations
of the notions of sources, channels and codes. We will now review a selection of results
(by no means comprehensive) that will be relevant to what follows. For systematic
presentations of quantum information theory, see Bouwmeester et al. (2000); Preskill (1998);
Nielsen and Chuang (2000); Bennett and Shor (1998). Ekert and Jozsa also provide a nice
review of quantum computation up to and including the development of Shor's algorithm.

The first type of task one might consider consists of using quantum systems to
transmit classical information. Whilst we are used to thinking that an n-dimensional
quantum system possesses at most n mutually distinguishable (i.e. orthogonal) states 
in which it might be prepared, one is also free to prepare such a system in one of 
any number of non-orthogonal states. The price, of course, is that one will not then 
be able to determine perfectly which state was prepared. This already makes things 
interesting. It forces us to draw a distinction that is not needed in the classical case, 
that is, a distinction between the amount of information that is used to prepare, or is 
needed to specify, the state of a quantum system (the specification information) and the 
information that has actually been encoded into a system (the accessible information). 

So, consider a classical information source, $A$, that has outputs $a_i$, $i = 1, \ldots, k$, which
occur with probabilities $p(a_i)$. We will attempt to encode the output of this source into
sequences of $n$-dimensional quantum systems, as follows. On receipt of output $a_i$ of $A$, a
quantum system is prepared in the signal state $\rho_{a_i}$. These signal states may be pure or
mixed, and may or may not be orthogonal. If the number of outputs, $k$, of the classical
source is greater than $n$, though, the signal states will have to be non-orthogonal.

We may consider sequences of length $N$ of signal states being prepared in this manner,
where $N$ is very large. The amount of information needed to specify this sequence will
be $N H(A)$ bits. The specification information, then, is the number of bits per system
in the sequence needed to specify the whole sequence of states, and is given by the
information of the classical source.

The quantum analogue of the Shannon information $H$ is the von Neumann entropy
(Wehrl, 1978):

$$S(\rho) = -\mathrm{Tr}\,\rho \log \rho = -\sum_{i=1}^{n} \lambda_i \log \lambda_i, \tag{1.6}$$

where $\rho$ is a density operator on an $n$-dimensional Hilbert space and the $\lambda_i$ are its
eigenvalues. For very large $N$, the sequence of quantum systems produced by our preparation
procedure may be considered as an ensemble described by the density operator

$$\rho = \sum_{i=1}^{k} p(a_i)\,\rho_{a_i}. \tag{1.7}$$

Equally, if one does not know the output of the classical source on a given run of the
preparation procedure, then the state of the individual system prepared on that run
may also be described by this density operator.

The von Neumann entropy takes its maximum value, $\log n$, when $\rho$ is maximally
mixed, and its minimum value, zero, if $\rho$ is pure. It also satisfies the inequality
(Wehrl, 1978)

$$S\Big(\sum_{i=1}^{k} p(a_i)\,\rho_{a_i}\Big) \leq H(A) + \sum_{i=1}^{k} p(a_i)\,S(\rho_{a_i}), \tag{1.8}$$

which holds with equality iff the $\rho_{a_i}$ are mutually orthogonal, i.e., $\rho_{a_i}\rho_{a_j} = 0$ for $i \neq j$.
Thus the specification information of the sequence, which is limited only by the number
of outputs $k$ of the classical source, may be much greater than its von Neumann entropy,
which is limited by the dimensionality of our quantum systems.
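
As a rough numerical illustration of inequality (1.8), the following sketch takes the case of
qubits, n = 2. It is not part of the argument above: the three 'trine' signal states, their
probabilities and the helper names are assumptions chosen purely for illustration. The sketch
relies on the standard fact that a qubit density operator with Bloch vector r has
eigenvalues (1 ± |r|)/2, and that the Bloch vector of a mixture is the weighted average of
the Bloch vectors of its components.

```python
import math

def shannon(probs):
    """Shannon information H (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def qubit_entropy(bloch):
    """von Neumann entropy S (in bits) of a qubit state given by its Bloch vector."""
    r = min(1.0, math.sqrt(sum(c * c for c in bloch)))  # clamp rounding noise
    return shannon([(1 + r) / 2, (1 - r) / 2])

def mixture(probs, blochs):
    """Bloch vector of rho = sum_i p(a_i) rho_{a_i}."""
    return [sum(p * b[k] for p, b in zip(probs, blochs)) for k in range(3)]

# Hypothetical source: k = 3 equiprobable outputs encoded in three pure,
# non-orthogonal qubit states 120 degrees apart in the x-z plane.
probs = [1 / 3] * 3
signals = [(math.sin(a), 0.0, math.cos(a))
           for a in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]

H_A = shannon(probs)                            # specification information
S_rho = qubit_entropy(mixture(probs, signals))  # entropy of the ensemble (1.7)
rhs = H_A + sum(p * qubit_entropy(b) for p, b in zip(probs, signals))

print(f"H(A) = {H_A:.3f} bits, S(rho) = {S_rho:.3f} bits, right side of (1.8) = {rhs:.3f}")
```

With k = 3 outputs and n = 2, the signal states cannot be orthogonal, so (1.8) is strict: the
specification information is log 3 bits per system, while the von Neumann entropy cannot
exceed log n = 1 bit.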

So, how much information have we actually managed to encode into these quantum
systems? To answer this question we need to consider making measurements on the
systems, and the resulting mutual information $H(A:B)$, where $B$ labels the observable
measured, having outcomes $b_j$, with probabilities $p(b_j)$, $j = 1, \ldots, m$. If we take 'encoded'
to be a 'success' word (something cannot be said to have been encoded if it cannot in
principle be decoded), then the maximum amount of information encoded in a system
is given by the accessible information (cf. Schumacher, 1995), that is, the maximum
over all decoding observables of the mutual information. A well known result due to
Holevo (1973) provides an upper bound on the mutual information resulting from the
measurement of any observable, including positive operator valued (POV) measurements
(which, recall, may have more outcomes than the dimensionality of the system being
measured). This bound is:

$$H(A:B) \leq S(\rho) - \sum_{i=1}^{k} p(a_i)\,S(\rho_{a_i}), \tag{1.9}$$

with equality iff the $\rho_{a_i}$ commute.

The Holevo bound (1.9) implies the weaker inequality

$$H(A:B) \leq S(\rho) \leq \log n,$$

reinforcing our intuitive understanding that the maximum amount of information that
may be encoded into a quantum system is limited by the number of orthogonal states
available, i.e., by the dimension of the system's Hilbert space (even if we allow ourselves
POV measurements to try to distinguish better non-orthogonal states). In particular,
note that for a single qubit, the most that can be encoded is one bit of information.
Again, from the Holevo bound and inequality (1.8) it follows that

$$H(A:B) \leq S(\rho) - \sum_{i=1}^{k} p(a_i)\,S(\rho_{a_i}) \leq H(A).$$

The inequality on the right hand side will be strict if the encoding states are not
orthogonal, implying that the accessible information will be strictly less than the
specification information $H(A)$ in this case. This is a way of making precise the intuition
that when encoding in non-orthogonal states, it is not possible to determine perfectly which
states were prepared. If $H(A:B) < H(A)$ for every measurement $B$, then it is impossible to
determine accurately what sequence of states was prepared by performing measurements
on the sequence.
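
The chain of inequalities above can be checked numerically in a simple case. The sketch
below is only an illustration under stated assumptions: two equiprobable pure qubit signal
states with real amplitudes separated by a hypothetical angle t, and a search restricted to
projective measurements in rotated orthonormal bases (POV measurements are not explored).
For pure signal states the Holevo quantity in (1.9) reduces to S(ρ), and for two equiprobable
pure states the eigenvalues of ρ are (1 ± |⟨a_1|a_2⟩|)/2.

```python
import math

def mutual_information(prior, cond):
    """H(A:B) in bits from p(a_i) and the conditional probabilities p(b_j|a_i)."""
    p_b = [sum(prior[i] * cond[i][j] for i in range(len(prior)))
           for j in range(len(cond[0]))]
    total = 0.0
    for i, p_a in enumerate(prior):
        for j, p_ba in enumerate(cond[i]):
            joint = p_a * p_ba
            if joint > 0:
                total += joint * math.log2(joint / (p_a * p_b[j]))
    return total

def binary_entropy(p):
    """Entropy in bits of the distribution (p, 1 - p)."""
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

def measure(t, phi):
    """p(b_j|a_i) for signal states |a_1> = (1, 0), |a_2> = (cos t, sin t),
    measured in the orthonormal basis rotated by angle phi."""
    states = [(1.0, 0.0), (math.cos(t), math.sin(t))]
    basis = [(math.cos(phi), math.sin(phi)), (-math.sin(phi), math.cos(phi))]
    return [[(b[0] * s[0] + b[1] * s[1]) ** 2 for b in basis] for s in states]

prior = [0.5, 0.5]   # so H(A) = 1 bit
t = math.pi / 4      # hypothetical angle between the two signal states

# Holevo bound for this ensemble: S(rho), since the signal states are pure.
holevo = binary_entropy((1 + abs(math.cos(t))) / 2)

# Largest mutual information found on a coarse grid of measurement bases.
best = max(mutual_information(prior, measure(t, k * math.pi / 720))
           for k in range(720))

print(f"best H(A:B) found = {best:.4f}, Holevo bound = {holevo:.4f}, H(A) = 1")
```

In this example the best mutual information found lies below the Holevo bound, which in
turn lies below H(A); setting t to π/2 (orthogonal signal states) closes both gaps.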

Let us now look at quantum coding. Rather than beginning by considering a classical
source, we could instead begin with a quantum source. If a classical source is modelled
by an ensemble $A$ from which letters $a_i$ are drawn with probabilities $p(a_i)$, the quantum
source will be modelled similarly by an ensemble of systems in states $\rho_{a_i}$, produced with
probabilities $p(a_i)$ (Schumacher, 1995). We will assume these states to be pure,
$\rho_{a_i} = |a_i\rangle\langle a_i|$. Then, just as Shannon's noiseless coding theorem introduces the concept of the
bit as a measure of information, the quantum noiseless coding theorem introduces the
concept of the qubit as a measure of quantum information, characterising the quantum
source.

By an ingenious argument, the quantum noiseless coding theorem runs parallel to
Shannon's noiseless coding theorem, using much the same mathematical ideas. If we
consider a long sequence of $N$ systems drawn from the quantum source, their joint state
will be

$$\rho^{\otimes N} = \rho_1 \otimes \rho_2 \otimes \ldots \otimes \rho_N,$$

where $\rho_i$ is the density operator for the $i$th system, given by eqn. (1.7), with
$\rho_{a_i} = |a_i\rangle\langle a_i|$. In the classical case, for large enough $N$, we needed only to consider sending
typical sequences of outcomes, of which there were $2^{N H(A)}$ for a source $A$, as only
these had non-vanishing probability. Similarly in the quantum case, for large enough
$N$, the joint state $\rho^{\otimes N}$ will have support on two orthogonal subspaces, one of which,
the typical subspace, will have dimension $2^{N S(\rho)}$ and will carry the vast majority of
the weight of $\rho^{\otimes N}$, whilst the other subspace will have vanishingly small weight as
$N \to \infty$ (Schumacher, 1995). Because of this, the state $\rho^{\otimes N}$ may be transmitted
with arbitrarily small error by being encoded onto a channel system of only $2^{N S(\rho)}$
dimensions (Schumacher, 1995; Jozsa and Schumacher, 1994), for example, onto $N S(\rho)$
qubits. These channel systems may then be sent to the receiver and the original state
recovered with near perfect fidelity. Thus, analogously to the classical case, we have
a measure of the resources (now quantum resources, mind) required to transmit what
is produced by our quantum source. The von Neumann entropy provides a measure,
in qubits, of the amount by which the output of our source may be compressed, hence
provides a measure of the amount of quantum information the source produces.
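
The concentration of weight on a typical subspace can be illustrated numerically for a qubit
source. The following is a sketch only, under assumptions that are mine rather than part of
the theorem: a hypothetical source emitting |0⟩ and (|0⟩ + |1⟩)/√2 with equal probability,
and a fixed tolerance delta defining which sequences of eigenprojectors count as typical. It
counts the sequences in which the smaller eigenvalue of ρ occurs with relative frequency
within delta of its value, and sums their weights.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits; applied to eigenvalues it gives S(rho)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def typical_subspace(lam1, N, delta):
    """Weight and log2(dimension) of the span of the N-fold eigenprojector
    sequences in which the eigenvalue lam1 occurs with relative frequency
    within delta of lam1 (qubit source with eigenvalues 1 - lam1 and lam1)."""
    lam0 = 1 - lam1
    lo = max(0, math.ceil(N * (lam1 - delta)))
    hi = min(N, math.floor(N * (lam1 + delta)))
    weight, dim = 0.0, 0
    for k in range(lo, hi + 1):
        count = math.comb(N, k)  # number of sequences with k occurrences
        dim += count
        # work with logarithms to avoid floating-point underflow
        weight += math.exp(math.log(count)
                           + (N - k) * math.log(lam0) + k * math.log(lam1))
    return weight, math.log2(dim)

# Hypothetical source: |a_1> = |0>, |a_2> = (|0> + |1>)/sqrt(2), equiprobable,
# so that rho has eigenvalues (1 +/- 1/sqrt(2))/2.
lam1 = (1 - 1 / math.sqrt(2)) / 2
S = entropy_bits([lam1, 1 - lam1])

for N in (100, 500, 2000):
    weight, log_dim = typical_subspace(lam1, N, delta=0.03)
    print(f"N = {N}: weight {weight:.4f}, qubits per system {log_dim / N:.3f}, "
          f"S(rho) = {S:.3f}")
```

As N grows the weight carried by the typical sequences approaches one, while the number of
qubits per system needed to hold them stays well below one. With a fixed tolerance the rate
sits somewhat above S(ρ); it approaches S(ρ) only as the tolerance is shrunk with increasing N.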

Footnote: To see this, note that $\rho^{\otimes N}$ can be written as a weighted sum of $N$-fold tensor
products of one-dimensional eigenprojectors of $\rho$, with weights given by the products of the
corresponding eigenvalues $\lambda_i$ of $\rho$. For large $N$ there will be $2^{N H(\lambda)}$, with
$H(\lambda) = -\sum_{i=1}^{n} \lambda_i \log \lambda_i$, equiprobable typical sequences
of eigenprojectors in this sum, i.e., sequences in which the relative frequency of occurrence of a given
projector is equal to its associated eigenvalue, while all other sequences in the sum have very small
weight. But $-\sum_{i=1}^{n} \lambda_i \log \lambda_i$ is just the von Neumann entropy $S(\rho)$.

Footnote: The converse to the quantum noiseless coding theorem, that $N S(\rho)$ qubits are necessary
for accurate transmission, was proved in full generality by Barnum et al. (1996b).

The use of entanglement as a communication resource is a centrally important
feature of quantum information theory. The two paradigmatic examples of
entanglement-assisted communication are dense coding and teleportation. In dense coding
(Bennett and Wiesner, 1992) prior shared entanglement between two widely separated
parties, Alice and Bob, allows Alice to transmit to Bob two bits of information
when she only sends him a single qubit. This would be impossible if they did not
share a maximally entangled state, e.g., the singlet state (one of the four Bell states,
see Table 1.1) beforehand. The trick is that Alice may use a local unitary operation
to change the global state of the entangled pair into one of four different orthogonal
states. If she then sends Bob her half of the entangled pair he may perform a suitable
joint measurement to determine which operation she applied; thence acquiring two bits
of information.

$$\begin{array}{lll}
|\phi^{+}\rangle & = \tfrac{1}{\sqrt{2}}\big(|\uparrow\rangle|\uparrow\rangle + |\downarrow\rangle|\downarrow\rangle\big) & = -i\sigma_y \otimes \mathbf{1}\,|\psi^{-}\rangle \\
|\phi^{-}\rangle & = \tfrac{1}{\sqrt{2}}\big(|\uparrow\rangle|\uparrow\rangle - |\downarrow\rangle|\downarrow\rangle\big) & = -\sigma_x \otimes \mathbf{1}\,|\psi^{-}\rangle \\
|\psi^{+}\rangle & = \tfrac{1}{\sqrt{2}}\big(|\uparrow\rangle|\downarrow\rangle + |\downarrow\rangle|\uparrow\rangle\big) & = \sigma_z \otimes \mathbf{1}\,|\psi^{-}\rangle \\
|\psi^{-}\rangle & = \tfrac{1}{\sqrt{2}}\big(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle\big) &
\end{array}$$

Table 1.1: The four Bell states, a maximally entangled basis for $2 \otimes 2$ dim. systems. A
choice of one of the four operations $\{\mathbf{1}, \sigma_x, \sigma_y, \sigma_z\}$ applied to her system by Alice may
transform, for example, the singlet state to one of the other three states orthogonal to
it.
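
The encoding step of dense coding can be checked with a few lines of arithmetic. The sketch
below is illustrative only; the representation of states as lists of four amplitudes and the
helper names are my own assumptions. It applies each of the four operations
{1, σ_x, σ_y, σ_z} to Alice's half of the singlet state and prints the magnitudes of the
inner products between the four resulting states.

```python
import math

S2 = 1 / math.sqrt(2)

# Two-qubit amplitudes in the order |up,up>, |up,down>, |down,up>, |down,down>,
# with Alice's system written first.
SINGLET = [0j, S2 + 0j, -S2 + 0j, 0j]

PAULIS = {
    "1":       [[1, 0], [0, 1]],
    "sigma_x": [[0, 1], [1, 0]],
    "sigma_y": [[0, -1j], [1j, 0]],
    "sigma_z": [[1, 0], [0, -1]],
}

def apply_to_alice(op, state):
    """Apply a 2x2 operator to Alice's system only (op tensor identity)."""
    out = [0j] * 4
    for a_out in range(2):
        for a_in in range(2):
            for b in range(2):
                out[2 * a_out + b] += op[a_out][a_in] * state[2 * a_in + b]
    return out

def inner(u, v):
    """Inner product <u|v> of two state vectors."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

# Alice's four possible messages (two bits) correspond to four local operations.
encoded = {name: apply_to_alice(op, SINGLET) for name, op in PAULIS.items()}

for name, u in encoded.items():
    row = [round(abs(inner(u, v)), 12) for v in encoded.values()]
    print(f"{name:8s}", row)
```

The four encoded states come out mutually orthogonal, which is what allows Bob, once he
holds both systems, to distinguish them perfectly by a joint measurement.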



In the teleportation protocol, by contrast (Bennett et al., 1993), instead of being used
to help send classical information, shared entanglement is used to transmit an unknown
quantum state from Alice to Bob, with, remarkably, nothing that bears any relation to
the identity of the state travelling between them. Furthermore, during the protocol, the
state being teleported 'disappears' from Alice's location before 'reappearing' at Bob's
a little while later, thus providing the inspiration for the science fiction title of the
protocol. Also during the protocol, the initial shared entanglement is destroyed. One
ebit (the amount of entanglement in a maximally entangled state of a $2 \otimes 2$ system) is
used up in teleporting an unknown qubit state from one location to another.

Since teleportation is a linear process it may also be used for entanglement swapping. 
Let's say that Alice and Bob, who are widely spatially separated, share a maximally 
entangled state of a pair of particles labelled 3 and 4. Alice may decide to perform 
the teleportation operation on a system, 2, which is half of an entangled pair, 1 and 
2. Following the protocol, the entanglement between 1 and 2, and between 3 and 4 is 
destroyed, but 1 and 4 will end up entangled, whereas before they had been uncorrelated. 
The entanglement of 1 and 2 has been swapped onto entanglement of systems 1 and 4. 
We shall be considering dense coding and teleportation in detail in later chapters. 
We should note, finally, a very important restricting principle for quantum protocols.
This is the no cloning theorem (Dieks, 1982; Wootters and Zurek, 1982): It is impossible
to make copies of an unknown quantum state. This marks a considerable difference with
classical information processing protocols, as in the classical case, the value of a bit may
be freely copied into numerous other systems, perhaps by measuring the original bit
to see its value, and then preparing many other bits with this value. The same is not
possible with quantum systems. As is well known, it is impossible to determine the state
of a single quantum system by measurement (for a nice discussion see Busch (1997)), so
the measuring approach would clearly be a non-starter.

To see that no more general scheme would be possible either, consider a device
that makes a copy of an unknown state $|\alpha\rangle$. This would be implemented by a unitary
evolution $U$ that takes the product $|\alpha\rangle|\psi_0\rangle$, where $|\psi_0\rangle$ is a standard state, to the
product $|\alpha\rangle|\alpha\rangle$. Now consider another possible state $|\beta\rangle$. Suppose the device can copy
this state too: $U|\beta\rangle|\psi_0\rangle = |\beta\rangle|\beta\rangle$. If it is to clone a general unknown state, however, it
must be able to copy a superposition such as $|\xi\rangle = 1/\sqrt{2}\,(|\alpha\rangle + |\beta\rangle)$ also, but the effect
of $U$ on $|\xi\rangle|\psi_0\rangle$ is to produce a fully entangled state $1/\sqrt{2}\,(|\alpha\rangle|\alpha\rangle + |\beta\rangle|\beta\rangle)$ rather than the
required $|\xi\rangle|\xi\rangle$. It follows that no general cloning device is possible.

In fact it may be seen in the following way that if a device can clone more than
one state, then these states must belong to an orthogonal set. We are supposing that
$U|\alpha\rangle|\psi_0\rangle = |\alpha\rangle|\alpha\rangle$ and $U|\beta\rangle|\psi_0\rangle = |\beta\rangle|\beta\rangle$. Taking the inner product of the first equation
with the second, and using the fact that $U$ is unitary and $|\psi_0\rangle$ normalised, implies that
$\langle\alpha|\beta\rangle = \langle\alpha|\beta\rangle^2$, which is only satisfied if $\langle\alpha|\beta\rangle = 0$ or $1$,
i.e., only if $|\alpha\rangle$ and $|\beta\rangle$ are identical or orthogonal.
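
Both steps of the argument can be made concrete in a small numerical sketch. The choice of
|α⟩ and |β⟩ as the two basis states of a qubit, and the helper names, are assumptions made for
illustration. The first part compares the output forced by linearity with the product state a
cloner would have to produce; the second tests the condition ⟨α|β⟩ = ⟨α|β⟩² for a few real
values of the overlap.

```python
import math

def kron(u, v):
    """Tensor product of two state vectors given as lists of amplitudes."""
    return [a * b for a in u for b in v]

def inner(u, v):
    """Inner product <u|v>."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

alpha = [1.0, 0.0]   # hypothetical choice: |alpha> and |beta> orthogonal
beta = [0.0, 1.0]
xi = [(a + b) / math.sqrt(2) for a, b in zip(alpha, beta)]

# Output fixed by linearity: U acts on each term of the superposition separately.
linear_output = [(a + b) / math.sqrt(2)
                 for a, b in zip(kron(alpha, alpha), kron(beta, beta))]
# Output that a cloning device would be required to produce.
wanted = kron(xi, xi)

overlap = abs(inner(wanted, linear_output))
print(f"overlap between required and actual output: {overlap:.4f}")

# Unitarity requires <alpha|beta> = <alpha|beta>^2 for any two clonable states.
for x in (0.0, 0.3, 0.5, 1 / math.sqrt(2), 1.0):
    ok = math.isclose(x, x * x, abs_tol=1e-12)
    print(f"<alpha|beta> = {x:.3f}: consistent with cloning both states? {ok}")
```

The overlap is strictly less than one, so the device fails on the superposition; and of the
sample overlaps only 0 and 1 satisfy the unitarity condition.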



1.4 Information is Physical: The Dilemma 

A very striking claim runs through much of the literature in quantum information theory
and quantum computation. This is the claim that 'Information is Physical'. From the
conceptual point of view, however, this statement is rather baffling; and it is perhaps
somewhat obscure precisely what it might mean. Be that as it may, the slogan is often
presented as the fundamental insight at the heart of quantum information theory; and
it is frequently claimed to be entailed, or at least suggested, by the theoretical and
practical advances of quantum information and computation.

Footnote: It is in the context of state determination and superluminal signalling that the question of
cloning first arose. If it were possible to clone an unknown quantum state, then we could multiply up an
individual system into a whole ensemble in the same state; and it would then be quite possible to
determine what that state was (see, e.g., Section 2.3.2). This, of course, would then give rise to the
possibility of superluminal signalling using entanglement in an EPR-type setting: one would be able to
distinguish between different preparations of the same density matrix, hence determine superluminally
which measurement was performed on a distant half of an EPR pair.

Footnote: Is it too restrictive to consider only unitary evolutions? One can always consider a non-unitary
evolution, e.g. measurement, as a unitary evolution on a larger space. Introducing auxiliary systems,
perhaps including the state of the apparatus, doesn't affect the argument.

However, it would seem that the slogan 'Information is Physical' faces a difficult 
dilemma. If it is supposed to refer to information in the everyday sense then, whatever 
its precise meaning, it certainly implies a very strong reductionist claim. It would have 
to amount, amongst other things, to a claim that central semantic and mental attributes 
or concepts are reducible to physical ones. This, however, is a purely philosophical claim, 
and a contentious one at that. As such it is hard to see how it could be supported by 
the claims and successes in physics of quantum information theory. 

So is 'information' in the slogan supposed to be construed in the technical sense, then?
Well, perhaps. But if so, then the claim is merely that some physically defined quantity
is physical; and that is hardly an earth-shattering revelation. In particular it is now
hard to see how it could represent an important new theoretical insight.

Of course, there is in philosophy a tradition occupied by those who hope, or expect,
to achieve the reduction of semantic and related concepts to respectable physical ones.
Representatives of this tradition we might term the semantic naturalizers. We will
discuss Dretske (1981) as a well known example of such an approach briefly, below. The
semantic naturalizer would not jib at the claim that information in the everyday sense 
is physical; indeed would undoubtedly endorse it. However, this does not affect the 
point that if 'Information is Physical' adverts to information in the everyday sense, then 
what is at issue is a philosophical claim about the relations between different groups 
of concepts; and quantum information theory does not engage in this debate. Rather, 
as we have seen, this piece of physical theory seeks to describe the distinctive ways in 
which quantum systems, with all their unusual properties, may be used for various tasks 
of information processing and transmission. It does not, therefore, adjudicate upon, 
nor provide evidence for or against a philosophical claim concerning the reduction of 
semantic properties to physical ones; and it is none the worse for that. 

Footnote: Perhaps the most vociferous proponent of the idea that information is physical was the late
Rolf Landauer (e.g., Landauer, 1991, 1996).

Footnote: Another possible reading of the slogan will be discussed briefly at the end of Chapter 6.
In any case, it should be noted that the success, and even well-groundedness, of the
project of naturalizing semantic properties can hardly be said to be a settled question.
Certainly, a successful completion of the project has by no means been achieved. While
Loewer (1997) presents an up-beat account of progress, two recent sympathetic reviews
(among them McLaughlin and Rey, 1998) suggest that the project has yet to overcome
important systematic difficulties.

As noted in these reviews, proposals for naturalizing semantics typically face two
sorts of problems, whose ancestry, in fact, may be traced back to difficulties that
Grice (1957) raised for the crude causal theory of meaning. These are what may be called the
problem of error and the problem of fine grain.

In brief: it is an essential part of a proposal to naturalize semantics that an account
be given of the content of beliefs (or of propositional attitudes in general). The problem
of error relates to the feature of intentionality mentioned earlier: one might believe that
p when p is not the case; and this is hard to accommodate in a naturalized account of
content. (A very simple illustration: we might suggest that one has the belief that p
when one's belief is caused by the fact that p. But then one could only believe that p
if p were the case; and this is false.) The problem of fine grain is in articulating the
detailed structure of what is believed without using linguistic resources, as semantic
relations have a finer grain than causal ones. (To use a hackneyed example, my belief
that x is a creature with a heart is distinct from my belief that x is a creature with a
kidney, yet the properties of having a heart and having a kidney are (nomologically)
co-instantiated. Whatever is caused by a creature that has a heart is caused by a creature
that has a kidney.) There is no consensus on whether these problems have been, or
can be, satisfactorily addressed while an account still maintains its credentials as a fully
naturalistic one.

Moreover, we should note that there are many who would be inclined to argue that 
there is system in our apparent failure to provide a satisfactory naturalized account 
of semantics thus far. The pertinent thought is that language, being a rule governed 
activity, has an essential normative component that cannot be captured by any natural- 
istic explanation. The impetus behind this line of thought derives from Wittgenstein's 
reflections on meaning and rule-following (Wittgenstein).


Returning to quantum information theory, the following quotation from a recent 
article in Reviews of Modern Physics provides an apt illustration of the problematic 
claim that 'Information is Physical'. 

What is... surprising is the fact that quantum physics may influence the field 
of information and computation in a new and profound way, getting at the 
very root of their foundations... 

But why has this happened? It all began by realizing that information has 
a physical nature (Landauer, 1991;1996;1961). It is printed on a physical 
support..., it cannot be transported faster than light in vacuum, and it abides 
by natural laws. The statement that information is physical does not simply 
mean that a computer is a physical object, but in addition that information 
itself is a physical entity. 

In turn, this implies that the laws of information are restricted or governed
by the laws of physics. In particular, those of quantum physics.
(Galindo and Martín-Delgado, 2002)

Whilst illustrating the problem, this passage also invites a simple response, one indicat- 
ing the lines of a solution. 

Let's pick out three phrases: 

1. 'The statement that information is physical does not simply mean that a computer 
is a physical object' 

2. 'in addition... information itself is a physical entity' 

3. 'In turn, this implies that the laws of information are restricted or governed by 
the laws of physics.' 

Statement (2) is the one that purports to be presenting us with a novel ontological 
insight deriving from, or perhaps driving, quantum information theory. The difficulty is 
in understanding what this portentous sounding phrase might mean and, most especially, 
understanding what role it is supposed to play. 

For it is statement (3) (with 'laws of information' understood as 'laws governing in- 
formation processing') that really seems to be the important proposition, if our interest 
is what information processing is possible using physical systems, as it is in quantum 
information theory. And (2) is entirely unnecessary to establish (3), despite their concatenation
in the quotation above. All that we in fact require is part of statement
(1): computers, or more generally, information processing devices, are physical objects. 
What one can do with them is necessarily restricted by the laws of physics. 

Quantum information theory and quantum computation are theories about what 
we can do using physical systems, stemming from the recognition that the peculiar 
characteristics of quantum systems might provide opportunities rather than drawbacks. 
This project is evidently quite independent of any philosophical claim regarding the 
everyday concept of information. There is therefore no need for the quantum information 
scientist to take a stand on contentious questions such as whether semantic and mental 
concepts are reducible to physical ones. We have already noted that 'Information is 
Physical', with 'information' understood in the everyday sense, is not supported by the 
success of quantum information theory; no more, we now see, would such a claim be 
needed for it. All that is required is the obvious statement that the devices being used 
for information processing are physical devices. Contra statement (1) and the suggestion 
of Galindo and Martín-Delgado above, if anything more than this is meant (literally) by
'Information is Physical' then it is irrelevant to quantum information theory. 

It is perhaps helpful to note that part of the obscurity of statements like (2) may be 
the result of their seeming to incorporate a category mistake. Another example would 
be the following statement by Landauer: 

Information is not a disembodied abstract entity; it is always tied to a physical
representation. (Landauer, 1996, p. 188)

To see the nature of the mistake, let us return to the example I gave in Section 1.2.3
of writing down a message to my friend. As we noted, one distinguishes between the
sentence tokens inscribed (the collection of ink marks on the page) and what is written
down: the propositions expressed.

Now if 'information' refers to what is written down, inscribed, encoded, in or on
various physical objects, or to what is conveyed by a message (as it might well be
thought to do in statements like (2) and the Landauer quotation), then it makes no
sense to say that information is physical. For what is written down, as opposed to the
collection of ink marks on the page, is not physical. This would be the category error.
(The same argument applies if we consider information in the technical sense of what
is produced by an information source, see Section 1.2.3. The token is physical, but the
type belongs to a different logical category.)

While it is true, as Landauer says, that information is not a disembodied abstract
entity, this does not mean it is an embodied concrete entity: it is no sort of entity at all.
'Information' is functioning as an abstract noun and hence does not refer to an entity, nor
indeed to some sort of substance. Talk of the necessity of a physical representation (cf.
Steane (1997, p. 5): 'no information without physical representation!') only amounts to
(or need only amount to) the truism that if we are writing information down, or storing
it in a computer memory then we need something to write it on, or store it in. But this
doesn't make what is written down, or what is stored, physical.

Footnote: Note, furthermore, that it is by no means clear that with possessing information (as opposed to
containing it) there is any useful sense in which information finds a representation (a much over-used
term); although, it may be the case that, as a matter of contingent fact, someone's possessing information
supervenes on facts about their brain, nervous system and, perhaps, unrestrictedly large regions of the
universe.

1.5 Alternative approaches: Dretske

So far, little mention has been made of other philosophical discussions of the nature of
information. Instead, we have noted some features of the everyday concept of information
and seen how, in particular, this concept is distinct from the concept of information
due to Shannon. Floridi (2003) provides a useful summary of various other approaches
to the concepts of information to be found in the philosophical literature.

However, there is one particular approach that we must look into in greater detail:
that of Dretske in Knowledge and the Flow of Information (Dretske, 1981). Dretske
is a proponent of semantic naturalism; and in this book he articulates a position that
is directly opposed to the view that I have advocated regarding the significance of the
communication-theoretic notion of information. His distinctive claim is that a satisfactory
semantic concept of information is indeed to be found in information theory and
may be achieved with a simple extension of the Shannon theory: in his view there is not
a significant distinction between the technical and everyday concepts of information.

I shall suggest, however, that Dretske fails to establish this claim. Moreover, whether
or not his proposed semantic concept of information is in fact a satisfactory one, it enjoys 
no licit connection with Shannon's theory. 

Whilst agreeing with Shannon that the semantic aspects of information are irrelevant
to the engineering problem, Dretske also concurs with Weaver's assessment of the converse
proposition: "But this does not mean that the engineering aspects are necessarily
irrelevant to the semantic aspects" (Shannon and Weaver, 1963, p. 8). Of course, even if the
engineering aspects of mechanical communication systems are relevant, it still
needs to be demonstrated precisely what their relevance is.

Dretske begins by noting that one reason why the Shannon theory does not provide
a semantic notion of information is that it does not ascribe an amount of information
to individual messages, yet it is to individual messages that semantic properties would
apply. To circumvent this difficulty, he introduces the following quantity as a measure of
the amount of information that a single event $y_j$, which may be a signal, carries about
another event, or state of affairs, $x_i$:

Definition 1 (Dretske's information measure)

$$I_{x_i}(y_j) = -\log p(x_i) - H\big(p(x_{i'}|y_j)\big),$$

where $x_i \in \{x_{i'}\}$, $i' = 1, \ldots, m$; $y_j \in \{y_{j'}\}$, $j' = 1, \ldots, n$. That is, the amount of
information that the occurrence of $y_j$ carries about the occurrence (or obtaining) of $x_i$
is given by the surprise information of $x_i$, minus the uncertainty (as quantified by the
Shannon measure) in the conditional probability distribution for the $x_{i'}$ events (states
of affairs) given that $y_j$ occurred.
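
To get a feel for how the quantity in Definition 1 behaves, it can be evaluated directly from
a joint probability table and set beside the Shannon mutual information of the same
distribution. The sketch below is an illustration under assumptions of my own: the function
names are hypothetical, and the joint distribution is an invented example in which x and y
are statistically independent and one of the x events is improbable.

```python
import math

def shannon(probs):
    """Shannon information H (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def dretske_measure(joint, i, j):
    """I_{x_i}(y_j) = -log p(x_i) - H(p(x_i'|y_j)), from a table joint[i'][j']."""
    p_x = [sum(row) for row in joint]
    p_yj = sum(row[j] for row in joint)
    conditional = [row[j] / p_yj for row in joint]
    return -math.log2(p_x[i]) - shannon(conditional)

def mutual_information(joint):
    """Shannon mutual information H(X:Y) in bits of the joint distribution."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (p_x[a] * p_y[b]))
               for a, row in enumerate(joint)
               for b, p in enumerate(row) if p > 0)

# Hypothetical example: x and y independent, with x_0 improbable.
p_x, p_y = [0.01, 0.99], [0.5, 0.5]
joint = [[px * py for py in p_y] for px in p_x]

i_measure = dretske_measure(joint, 0, 0)
mi = max(0.0, mutual_information(joint))  # clamp floating-point rounding noise
print(f"I_x0(y0) = {i_measure:.3f} bits, H(X:Y) = {mi:.3f} bits")
```

In this example the first term of the definition, which depends only on p(x_i), dominates:
the quantity is several bits even though the mutual information between the two variables
is zero.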

From this definition of the amount of information that a single event carries, he
moves to a definition of what information is contained in a signal $S$:

Definition 2 (Dretske's information that)

A signal $S$ contains the information that $q$ $\Leftrightarrow$ $p(q|S) = 1$.

The point of this definition is that there is to be a perfect correlation between the
occurrence of the signal and what it is supposed to indicate: that q. 

Does this establish a link between the technical communication-theoretic notions of 
information and a semantic, everyday one? Not yet, at any rate. Whether Definition 2
supplies a satisfactory semantic notion of information isn't to be settled by stipulation, 
but would need to be established by the successful completion of a programme of se- 
mantic naturalism demonstrating that Dretske's notion of information that is indeed an 
adequate one. We have already noted that the question of whether such an objective 
might be achieved remains open. 

However, perhaps more tellingly, there appear in any case to be major difficulties in 
the other direction — for the thought that Dretske's notion of information that has any 
genuine ties to information theory. I shall mention two main sources of difficulty, either 
of which appears on its own sufficient to frustrate the claim that there are such ties. 

In Dretske's proposal, the link to information theory is supposed to be mediated 
by Definition 1 of the amount of information that an individual event carries about
another event or state of affairs. He argues that if a signal is to carry the information 
that q it must, amongst other things, carry as much information as is generated by the 
obtaining of the fact that q. 

Unfortunately, the quantity $I_{x_i}(y_j)$ cannot play the role of a measure of the amount
of information that $y_j$ carries about $x_i$. To see this we need merely note that the
surprise information associated with $x_i$ is largely independent of the uncertainty in the
conditional probability distribution for $x_{i'}$ given $y_j$. For example, our uncertainty in $x_{i'}$
given $y_j$ might be very large, implying that we would learn little from $y_j$ about the value
$x_{i'}$, yet still the amount said to be carried by $y_j$ about $x_i$, under Dretske's definition,
could be arbitrarily large, if the surprise information of $x_i$ dominates. Or again, the
channel might be so noisy that we can learn nothing at all about $x_i$ from $y_j$ (the two
are uncorrelated, no information can be transmitted), yet still $I_{x_i}(y_j)$ could be strictly
positive and very large (if the probability of $x_i$ is sufficiently small). This is sufficient
to show that $I_{x_i}(y_j)$ is unacceptable as a measure. The hoped-for link to information
theory is snapped.
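
The noisy-channel case can be checked numerically. The following is a minimal Python sketch, not anything found in Dretske; the probabilities and the function names (`dretske_measure`, `mutual_information`) are illustrative assumptions. It takes a signal that is statistically independent of the state of affairs, so that the Shannon mutual information is zero, and evaluates the quantity of Definition 1 for a rare value of x.

```python
import math

def shannon_entropy(dist):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def dretske_measure(p_xi, p_x_given_yj):
    """Definition 1: surprisal of x_i minus the entropy of p(x_i' | y_j)."""
    return -math.log2(p_xi) - shannon_entropy(p_x_given_yj)

def mutual_information(joint):
    """Shannon mutual information, in bits, from a table joint[i][j] = p(x_i, y_j)."""
    p_x = [sum(row) for row in joint]
    p_y = [sum(col) for col in zip(*joint)]
    return sum(
        pxy * math.log2(pxy / (p_x[i] * p_y[j]))
        for i, row in enumerate(joint)
        for j, pxy in enumerate(row)
        if pxy > 0
    )

# Hypothetical numbers: x has one rare value, and y is independent of x.
p_x = [0.999, 0.001]
p_y = [0.5, 0.5]
joint = [[px * py for py in p_y] for px in p_x]

# Independence means p(x_i' | y_j) = p(x_i') for every y_j.
dretske_value = dretske_measure(p_x[1], p_x)
channel_information = mutual_information(joint)

print(f"Dretske's measure for the rare value: {dretske_value:.3f} bits")
print(f"Mutual information of the channel:    {channel_information:.3f} bits")
```

With these assumed numbers the first quantity comes out at nearly ten bits although nothing whatsoever can be transmitted, which is the mismatch described in the text.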

The second main source of difficulty is that in most realistic situations it would
appear very difficult to specify how much information should be associated with the fact
that q. It would be straightforward enough, perhaps, if we always had a natural fixed
range of options to choose between, as we are supposing the set $\{x_{i'}\}$ provides, but how
should the different options in a realistic perceptual situation, say, be counted? The 
suspicion is that typically, there will be no well-defined range of distinct possibilities. 
Dretske himself notes this problem: 

How, for example, do we calculate the amount of information generated
by Edith's playing tennis?... [O]ne needs to know: (1) the alternative
possibilities... (2) the associated probabilities... (3) the conditional
probabilities... Obviously, in most ordinary communication settings one knows none
of this. It is not even very clear whether one could know it. What, after all,
are the alternative possibilities to Edith's playing tennis? Presumably there
are some things that are possible (e.g., Edith going to the hairdresser instead
of playing tennis) and some things that are not possible (e.g., Edith turning
into a tennis ball), but how does one begin to catalog these possibilities? If
Edith might be jogging, shall we count this as one alternative possibility?
Or shall we count it as more than one, since she could be jogging almost
anywhere, at a variety of different speeds, in almost any direction? (Dretske,
1981, p. 53)

His answer is that this spells trouble only for specifying absolute amounts of information; 
and it is comparative amounts of information with which he is concerned, in particular, 
with whether a signal carries as much information as is generated by the occurrence 
of a specified event, whatever the absolute values. But this response is surely too 
phlegmatic. If the ranges of possibilities aren't well-defined, then the associated measure 
of information is not well-defined; and the difference between the two quantities will not 
then be well-defined: two wrongs don't make a right. Dretske's attempt to forge a link 
with a theory of quantity-of-information-carried seems highly doubtful. 

Of course, at this point Dretske could re-trench and argue that what he means by a 
signal carrying the same amount of information as is associated with the fact that q is 

^One might try to finesse this difficulty by proposing different definitions for the amount of
information that a single event carries about another, or more likely, adopt a direct criterion for when a signal
carries 'as much' information as is generated by the obtaining of the fact that q (see below). None of the
obvious approaches, though, suggest that the appeal to an amount of information content (and hence
a link to a quantitative theory of information) is really anything other than a free-wheel.
simply that signal and fact are perfectly correlated. This would be consistent, but would
make it very plain that the digression via a quantitative theory of how much information 
a signal contains or an event generates is superfluous. It would now just be the concept 
of perfect correlation that is operative ab initio, not anything to do with measuring 
amounts of information that a signal can contain. This contrasts with Dretske's original 
hope that the requirement of perfect correlation between a signal and what it indicates 
could be motivated or derived from constraints on how much information a signal can 
carry. As is well known, in his later work Dretske did in fact move away from conditional 
probabilities in defining his concept of information that, using the idea of perfect lawlike 
correlation instead (although for different reasons than the ones we have been dwelling on
here (Dretske, 1983, 1988)), further emphasising that concepts from information theory
really play no genuine role in his framework.

It thus seems that the appearance of a link between Dretske's 1981 semantic notion 
of information and information theory is illusory. No ideas that involve quantifying 
amounts of information transmitted truly play any substantive role in arriving at
Definition 2. This means, first of all, that Dretske's notion of information that gains no
validation from the direction of information theory; and second, that his argument does 
not establish that there are closer ties between the communication-theoretic notion of 
information and the everyday notion than are usually admitted. 

It should be noted, finally, that care is required when considering Dretske's
Definition 2 (and the later statements that do not involve conditional probabilities) as a
possible primitive notion of information that^^. One must be aware that the definition
may appear intuitively appealing for illegitimate reasons: as the result of the new notion 
it introduces being conflated with the idea of containing information inferentially, for 
example. With this latter notion of containing information, it is clear enough why per- 
fect correlation can have a link to information: someone who knows of the correlation 
between signal and state of affairs may learn something about the state of affairs by 

^^By 'primitive', I mean a notion of information that comes before the concepts of knowledge and 
cognitive agent and may be used to explain these latter concepts. Cf. Dretske: 'In the beginning there 
was information. The word came later. The transition was achieved by the development of organisms 
with the capacity for selectively exploiting this information in order to survive and perpetuate their
kind.' (Dretske, 1981, p. vii).
observing the signal, in virtue of an inference. However, this notion of information,
containing information inferentially, is evidently not apt for the role of a primitive notion
of information that, as it relies upon the prior concept of a cognitive agent who may use
their knowledge of the correlation to gain further knowledge. For Dretske, information
is 'that commodity capable of yielding knowledge' (Dretske, 1981, p. 44), but the obvious
ways in which perfect correlation can yield knowledge (via an inference, or as part of a
natural sign that may be understood or interpreted) are not available for picking out a
primitive notion of information that, on pain of the homunculus fallacy.



1.6 Summary 

One of the main aims of this chapter has been to emphasise the distinction between the 
everyday concept of information — with its links to knowledge, language and meaning — 
and the technical notions of information that are developed in information theory^^. 
It is not just that these technical ideas are introduced only to provide notions of an 
amount of information, but that in most cases, information theory is silent on how 
much information in the everyday sense a message might contain or convey, if any. In 
general, what is quantified in information theory is emphatically not information in the 
everyday sense of the word. 

The following table provides a summary of some of the points that have been
argued and of some of the positions that might be adopted. 'SN' stands for 'Semantic
Naturalizer' and '¬SN' for 'non-Semantic Naturalizer'.



^^Warnings, more or less felicitous, that one should make this distinction abound. For example,
Weaver: '... information must not be confused with meaning. In fact, two messages, one of which is
heavily loaded with meaning and the other of which is pure nonsense, can be exactly equivalent from the
present viewpoint as regards information.' (Shannon and Weaver, 1963, p. 8). Similarly Feynman:
'..."information" in our sense tells us nothing about the usefulness or otherwise of the message.'
(Feynman, 1999, p. 118).
                                     Yes         No

Information is Physical?             SN          ¬SN

Everyday distinct from Technical?    ¬SN, SN     SN (early Dretske)



Regarding the question of the physicality of information, if we are concerned with
information in the everyday sense, then the dispute is between proponents of semantic
naturalism and others, and concerns the reducibility (or otherwise) of semantic concepts.
The semantic naturalizer will assert that information is, or may be, physical, while the
non-semantic naturalizer will deny this. This philosophical debate remains unsettled,
and I have suggested that it is a debate quantum information theory has no bearing 
upon. Equally, the outcome of the debate has no bearing on quantum information 
theory. 

Regarding the relationship between the everyday concept of information and informa- 
tion theoretic notions, both the semantic naturalizer and the non-semantic naturalizer 
can agree that these concepts are quite distinct. An attempt to naturalize semantics 
need not proceed by way of information theory; and given the very pronounced prima 
facie divergences between information theoretic notions and the everyday concept, it 
does not look a terribly promising avenue to explore. The early Dretske did attempt 
such an approach, however; and would claim that the distinction between information 
theory and the everyday notion of information may be elided. I have suggested, though, 
that this attempt to build bridges between information theory and the everyday concept 
of information is not successful. 



Chapter 2 

On a Supposed Conceptual Inadequacy of the Shannon Information in Quantum Mechanics



2.1 Introduction 



In Part II of this thesis, we will be considering the implications of quantum information
theory for the foundations of quantum mechanics. One of the topics we shall be
investigating there is the approach of Zeilinger, who has put forward an
information-theoretic principle which he suggests might serve as a foundational principle for quantum
mechanics (Zeilinger). As a part of this foundational project, Brukner and Zeilinger (2001)
have criticised Shannon's (1948) measure of information, the quantity fundamental to the
discussion of information in both classical and quantum information theory. They claim that the
Shannon information is not appropriate as a measure of information in the quantum
context and have proposed in its stead their own preferred quantity and a notion of
'total information content' associated with it, which latter is supposed to supplant the
von Neumann entropy (Brukner and Zeilinger, 1999b, 2000a,b). The main aim in
Brukner and Zeilinger (2001) is to establish that the Shannon information is
intimately tied to classical notions, in particular, to the preconceptions of
classical measurement, and that in consequence it cannot serve as a measure of
information in the quantum context. They seek to establish this in two ways. First, by
arguing that the Shannon measure only makes sense when we can take there to be a 
pre-existing sequence of bit values in a message we are decoding, which is not the case in 
general for measurements on quantum systems (consider measurements on qubits in a 
basis different from their eigenbasis); and second, by suggesting that Shannon's famous 
third postulate, the postulate that secures the uniqueness of the form of the Shannon
information measure (Shannon, 1948) and has been seen by many as a necessary axiom
for a measure of information, is motivated by classical preconceptions and does not apply 
in general in quantum mechanics where we must consider non-commuting observables. 

These two arguments do not succeed in showing that the Shannon information is 
'intimately tied to the notion of systems carrying properties prior to and independent
of observation' (Brukner and Zeilinger, 2000b, p. 1), however. The first is based on
too narrow a conception of the meaning of the Shannon information and the second,
primarily, on a misreading of what is known as the 'grouping axiom'. We shall see
that the Shannon information is perfectly well defined and appropriate as a measure of
information in the quantum context as well as in the classical (Section 2.2).

Brukner and Zeilinger have a further argument against the Shannon information
(Section 2.3). They suggest it is inadequate because it cannot be used to define an
acceptable notion of 'total information content'. Equally, they insist, the von Neumann 
entropy cannot be a measure of information content for a quantum system because it has 
no general relation to information gain from the measurements that we might perform 
on a system, save in the case of measurement in the basis in which the density matrix is 
diagonal. By contrast, for a particular set of measurements, their preferred information 
measure sums to a unitarily invariant quantity that they interpret as 'information con- 
tent', this being one of their primary reasons for adopting this specific measure. This 
property will be seen to have a simple geometric explanation in the Hilbert-Schmidt
representation of density operators, however, rather than being of any great information
theoretic significance; and this final argument will be found unpersuasive, as the proposed
constraint on any information measure regarding the definition of 'total information
content' seems unreasonable. Part of the problem is that information content, total or
otherwise, is not a univocal concept and we need to be careful to specify precisely what 
we might mean by it in any given context. 



2.2 Two arguments against the Shannon information

2.2.1 Are pre-existing bit-values required?

Since the quantity $-\sum_i p_i \log p_i$ is meaningful for any (discrete) probability distribution
$p = \{p_1, \ldots, p_n\}$ (and can be generalised for continuous distributions), the point of
Brukner and Zeilinger's first argument must be that when we have probabilities arising
from measurements on quantum systems, $-\sum_i p_i \log p_i$ does not correspond to a concept
of information. Their argument concerns measurements on systems that are all prepared
in a given state $|\psi\rangle$, where $|\psi\rangle$ may not be an eigenstate of the observable we are
measuring. The probability distribution $p$ for measurement outcomes will be given by
$p_i = \mathrm{Tr}(|\psi\rangle\langle\psi| P_i)$, where $P_i$ are the operators corresponding to different measurement
outcomes (projection operators in the spectral decomposition of the observable, for
projective measurements).
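
As a concrete illustration, the trace rule and the Shannon quantity can be evaluated for a single qubit. The sketch below is only an assumed example in Python: the state (an equal superposition), the two measurement bases and the helper name `born_probabilities` are illustrative choices, and the projectors are taken to be rank one so that the trace reduces to a squared overlap.

```python
import math

def shannon_entropy(dist):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def born_probabilities(state, basis):
    """p_i = Tr(|psi><psi| P_i) = |<e_i|psi>|^2 for projectors P_i = |e_i><e_i|."""
    probs = []
    for e in basis:
        amplitude = sum(complex(c).conjugate() * s for c, s in zip(e, state))
        probs.append(abs(amplitude) ** 2)
    return probs

s = 1 / math.sqrt(2)
z_basis = [(1, 0), (0, 1)]    # basis in which the state is not an eigenstate
x_basis = [(s, s), (s, -s)]   # basis containing the state itself
plus_state = (s, s)           # assumed example state: equal superposition

p_z = born_probabilities(plus_state, z_basis)
p_x = born_probabilities(plus_state, x_basis)
h_z = shannon_entropy(p_z)
h_x = shannon_entropy(p_x)

print("outcome probabilities, non-eigenbasis:", p_z, "entropy:", h_z)
print("outcome probabilities, eigenbasis:    ", p_x, "entropy:", h_x)
```

Nothing in the calculation refers to where the probabilities came from: once the distribution is fixed by the trace rule, the Shannon quantity is computed exactly as it would be for a classical source.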

Brukner and Zeilinger suggest that the Shannon information has no meaning in the 
quantum case, because the concept lacks an 'operational definition' in terms of the num- 
ber of binary questions needed to specify an actual concrete sequence of outcomes. In 
general in a sequence of measurements on quantum systems, we cannot consider there 
to be a pre-existing sequence of possessed values, at least if we accept the orthodox
eigenvalue-eigenstate link for the ascription of definite values (see e.g. Bub (1997)),^
and this rules out, they insist, interpreting the Shannon measure as an amount of infor- 
mation: 



^In a footnote, Brukner and Zeilinger suggest that the Kochen-Specker theorem in particular raises 
problems for the operational definition of the Shannon information. It is not clear, however, why the 
impossibility of assigning context independent yes/no answers to questions asked of the system should 
be a problem if we are considering an operational definition. Presumably such a definition would 
include a concrete specification of the experimental situation, i.e. refer to the context, and then we are 
not concerned with assigning a value to an operator but to the outcome of a specified experimental 
procedure, and this can be done perfectly consistently, if we so wish. The de Broglie-Bohm theory, of
course, provides a concrete example.
The nonexistence of well-defined bit values prior to and independent of
observation suggests that the Shannon measure, as defined by the number of
binary questions needed to determine the particular observed sequence 0's
and 1's, becomes problematic and even untenable in defining our uncertainty
as given before the measurements are performed. (Brukner and Zeilinger,
2001, p. 1)

...No definite outcomes exist before measurements are performed and
therefore the number of different possible sequences of outcomes does not
characterize our uncertainty about the individual system before measurements are
performed. (Brukner and Zeilinger, 2001, p. 3)

These two statements should immediately worry us, however. Recall the key points of 
the interpretation of the Shannon information (Section 1.2): given a long message (a
long run of experiments), we know that it will be one of the typical sequences that is
instantiated. Given $p$, we can say what the typical sequences will be, how many there
are, and hence the number of bits ($NH(X)$) needed to specify them, independent of
whether or not there is a pre-existing sequence of bit values. It is irrelevant whether
there already is some concrete sequence of bits or not; all possible sequences that will be 
produced will require the same number of bits to specify them as any sequence produced 
will always be one of the typical sequences. It clearly makes no difference to this whether 
the probability distribution is given classically or comes from the trace rule. Also, the 
number of different possible sequences does indeed tell us about our uncertainty before 
measurement: what we know is that one of the typical sequences will be instantiated, 
what we are ignorant of is which one it will be, and we can put a measure on how 
ignorant we are simply by counting the number of different possibilities. Brukner and 
Zeilinger's attempted distinction between uncertainty before and after measurement is 
not to the point: the uncertainty is a function of the probability distribution and this is
perfectly well defined before measurement^.
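
The counting involved can be made concrete with a rough sketch. The Python below is an illustration under simplifying assumptions, not the full typical-sequence argument: it counts only those binary sequences whose relative frequency of 1's matches the probability exactly, the probability 0.1 is an arbitrary choice, and `bits_per_trial` is a hypothetical helper name. The count depends on the distribution alone, whether that distribution is given classically or by the trace rule.

```python
import math

def binary_entropy(p):
    """Shannon entropy, in bits, of the distribution (p, 1 - p)."""
    return -sum(q * math.log2(q) for q in (p, 1 - p) if q > 0)

def bits_per_trial(n_trials, p_one):
    """Bits per trial needed to label every length-N binary sequence
    containing exactly round(N * p_one) ones."""
    k = round(n_trials * p_one)
    return math.log2(math.comb(n_trials, k)) / n_trials

p_one = 0.1  # assumed probability of outcome 1
rates = {n: bits_per_trial(n, p_one) for n in (100, 1000, 10000)}
entropy = binary_entropy(p_one)

for n, rate in rates.items():
    print(f"N = {n:5d}: {rate:.4f} bits per trial")
print(f"H(p)     : {entropy:.4f} bits per trial")
```

As the number of trials grows, the number of bits per trial needed to pick out one such sequence approaches the Shannon value from below, and at no point is any particular pre-existing sequence invoked.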

Brukner and Zeilinger have assumed that it is a necessary and sufficient condition to 
understand H as a measure of information that there exists some concrete string of N 
values, for then and only then can we talk of the minimum number of binary questions 
needed to specify the string. But as we have now seen, it is not a necessary condition 

^We may need to enter at this point the important note that the Shannon information is not supposed 
to describe our general uncertainty when we know the state, this is a job for a measure of mixedness 
such as the von Neumann entropy, see below. 
that there exist such a sequence of outcomes.

We are not in any case forced to assume that H is about the number of questions 
needed to specify a sequence in order to understand it as a measure of information; we 
also have the interpretations in terms of the maximum amount a message drawn from 
an ensemble described by the probability distribution p can be compressed, and as the 
expected information gain on measurement. (And as we have seen, one of these two 
interpretations must in fact be prior.) Furthermore, the absence of a pre-existing string 
need not even be a problem for the minimum average questions interpretation — we can 
ask about the minimum average number of questions that would be required if we were 
to have a sequence drawn from the ensemble. So again, the pre-existence of a definite 
string of values is not a necessary condition. 

It is not a sufficient condition either, because, faced with a string of N definite 
outcomes, in order to interpret NH as the minimum average number of questions needed 
to specify the sequence, we need to know that we in fact have a typical sequence, that is, 
we need to imagine an ensemble of such typical sequences and furthermore, to assume 
that the relative frequencies of each of the outcomes in our actual string is representative 
of the probabilities of each of the outcomes in the notional ensemble from which the 
sequence is drawn. If we do not make this assumption, then the minimum number of 
questions needed to specify the state of the sequence must be $N$: we cannot imagine
that the statistical nature of the source from which the sequence is notionally drawn 
allows us to compress the message. So even in the classical case, the concrete sequence 
on its own is not enough and we need to consider an ensemble, either of typical sequences 
or an ensemble from which the concrete sequence is drawn. In this respect the quantum 
and classical cases are completely on a par. The same assumption needs to be made 
in both cases, namely, that the probability distribution p, either known in advance, or 
derived from observed relative frequencies, correctly describes the probabilities of the 
different possible outcomes. The fact that no determinate sequence of outcomes exists 
before measurement does not pose any problems for the Shannon information in the 
quantum context. 

Reiterating their requirements for a satisfactory notion of information, Brukner and 
Zeilinger say:

We require that the information gain be directly based on the observed
probabilities, (and not, for example, on the precise sequence of individual
outcomes observed on which Shannon's measure of information is based).
(Brukner and Zeilinger, 2000b, p. 1)

But as we have seen, it is false that the Shannon measure must be based on a precise 
sequence of outcomes (this is not a necessary condition) and the Shannon measure 
already is and must be based on the observed probabilities (a sequence of individual 
outcomes on its own is not sufficient). 

There is, however, a difference between the quantum and classical cases that Brukner 
and Zeilinger may be attempting to capture. Suppose we have a sequence of N qubits 
that has actually been used to encode some information, that is, the sequence of qubits 
is a channel to which we have connected a classical information source. For simplicity, 
imagine that we have coded in orthogonal states. Then the state of the sequence of 
qubits will be a product of $|0\rangle$'s and $|1\rangle$'s and for measurements in the encoding basis,
the sequence will have a Shannon information equal to $NH(A)$ where $H(A)$ is the
information of the classical source. If we do not measure in the encoding basis, however, the
sequence of 0's and 1's we get as our outcomes will differ from the values originally
encoded and the Shannon information of the resulting sequence will be greater than that of
the original. We have introduced some 'noise' by measuring in the wrong basis. As we
have seen, however, the way we describe this sort of situation (e.g. Schumacher)
is to use the Shannon mutual information $H(A:B) = H(A) - H(A|B)$, where $B$
denotes the outcome of measurement of the chosen observable (outcomes $b_i$ with
probabilities $p(b_i)$) and the 'conditional entropy'
$H(A|B) = \sum_{i=1}^{n} p(b_i) H(p(a_1|b_i), \ldots, p(a_m|b_i))$
characterises the noise we have introduced by measuring in the wrong basis. $H(B)$ is
the information (per letter) of the sequence that we are left with after measurement,
$H(A:B)$ tells us the amount of information that we have actually managed to transmit

^We may think of our initial sequence of qubits as forming an ensemble described by the density
operator $\rho = p_1|0\rangle\langle 0| + p_2|1\rangle\langle 1|$, where $p_1, p_2$ are the probabilities for 0 and 1 in our original classical
information source. Any (projective) measurement that does not project onto the eigenbasis of $\rho$ will
result in a post-measurement ensemble that is more mixed than $\rho$ (see e.g. Nielsen (2001); Peres (1995)
and below) and hence will have a greater uncertainty, thus a greater Shannon information, or any other
measure of information gain.
Figure 2.1: The probability distribution $p = \{\frac{1}{2}, \frac{1}{3}, \frac{1}{6}\}$ can be considered as given for the
three outcomes directly, or we could consider first a choice of two equiprobable events,
followed by a second choice of two events with probabilities $\frac{2}{3}, \frac{1}{3}$, conditional on the
second, say, of the first two events occurring, a 'decomposition' of a single choice into
two successive choices, the latter of which will only be made half the time. Shannon's
third requirement says that the uncertainty in $p$ will be given by
$H(\frac{1}{2}, \frac{1}{3}, \frac{1}{6}) = H(\frac{1}{2}, \frac{1}{2}) + \frac{1}{2} H(\frac{2}{3}, \frac{1}{3})$:
the uncertainty of the overall choice is equal to the uncertainty of the first
stage of the choice, plus the uncertainty of the second choice weighted by its probability
of occurrence.

down our channel, i.e. the amount (per letter) that can be decoded when we measure 
in the wrong basis. 
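
The quantities just described can be computed for a simple case. The sketch below, in Python, rests on assumptions that are not in the text: a source with probabilities (0.9, 0.1), encoding in the states |0> and |1>, and a projective measurement in a basis rotated by a Bloch-sphere angle theta, so that an encoded value is read out wrongly with probability sin^2(theta/2). The function name `wrong_basis_entropies` is hypothetical.

```python
import math

def entropy(dist):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def wrong_basis_entropies(p_a, theta):
    """H(A), H(B), H(A|B) and H(A:B) when source letters a_0, a_1 are encoded
    in |0>, |1> and measured in a basis rotated by Bloch angle theta."""
    flip = math.sin(theta / 2) ** 2          # p(b_1 | a_0) = p(b_0 | a_1)
    p_b_given_a = [[1 - flip, flip], [flip, 1 - flip]]
    joint = [[p_a[i] * p_b_given_a[i][j] for j in range(2)] for i in range(2)]
    p_b = [joint[0][j] + joint[1][j] for j in range(2)]
    # H(A|B) = sum_j p(b_j) H(p(a_0|b_j), p(a_1|b_j))
    h_a_given_b = sum(
        p_b[j] * entropy([joint[i][j] / p_b[j] for i in range(2)])
        for j in range(2)
        if p_b[j] > 0
    )
    h_a = entropy(p_a)
    return {"H(A)": h_a, "H(B)": entropy(p_b),
            "H(A|B)": h_a_given_b, "H(A:B)": h_a - h_a_given_b}

source = (0.9, 0.1)  # assumed source probabilities
aligned = wrong_basis_entropies(source, 0.0)
tilted = wrong_basis_entropies(source, math.pi / 4)
conjugate = wrong_basis_entropies(source, math.pi / 2)

for label, result in (("theta = 0", aligned), ("theta = pi/4", tilted),
                      ("theta = pi/2", conjugate)):
    print(label, {name: round(value, 4) for name, value in result.items()})
```

In this example the information per letter of the measured sequence, H(B), rises as the basis is tilted, while the amount actually transmitted, H(A:B), falls from H(A) towards zero.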

2.2.2 The grouping axiom 

The first argument has not revealed any difficulties for the Shannon information in the 
quantum context, so let us now turn to the second. 

In his original paper, Shannon put forward three properties as reasonable
requirements on a measure of uncertainty and showed that the only function satisfying these
requirements has the form $H = -K \sum_{i=1}^{n} p_i \log p_i$.^

The first two requirements are that $H$ should be continuous in the $p_i$ and that for
equiprobable events ($p_i = 1/n$), $H$ should be a monotonic increasing function of $n$. The
third requirement is the strongest and the most important in the uniqueness proof. It
states that if a choice is broken down into two successive choices, the original $H$ should
be a weighted sum of the individual values of $H$. The meaning of this rather
non-intuitive constraint is usually demonstrated with an example (see Fig. 2.1). A precise
statement of Shannon's third requirement (one that includes also the second requirement
^In contrast to some later writers, however, notably Jaynes, he set little store by this
derivation, seeing the justification of his measure as lying rather in its implications (Shannon, 1948). Save the
noiseless coding theorem, the most significant of the implications that Shannon goes on to draw are, as
has been pointed out by Uffink, consequences of the property of Schur concavity and hence shared by
the general class of measures of uncertainty derived in Uffink (1990).
as a special case) is due to Faddeev and is often known as the Faddeev grouping
axiom:



Grouping Axiom 1 (Faddeev) For every $n \geq 2$

$$H(p_1, p_2, \ldots, p_{n-1}, q_1, q_2) = H(p_1, \ldots, p_{n-1}, p_n) + p_n H\left(\frac{q_1}{p_n}, \frac{q_2}{p_n}\right) \qquad (2.1)$$

where $p_n = q_1 + q_2$.

The form of the Shannon information follows uniquely from requiring $H(p, 1-p)$ to be
continuous for $0 \leq p \leq 1$ and positive for at least one value of $p$, permutation invariance
of $H$ with respect to relabelling of the $p_i$, and the grouping axiom.



'Grouping axiom' is an appropriate name. As it is standardly understood (see e.g.
Ash (1965); Uffink (1990); Jaynes (2003)), we consider that instead of giving the
probabilities of the outcomes $x_1, \ldots, x_n$ of a probabilistic experiment directly,
we may imagine grouping the outcomes into composite events (whose probabilities will
be given by the sum of the probabilities of their respective component events), and
then specifying the probabilities of the outcome events conditional on the occurrence
of the composite events to which they belong; this way of specifying the probabilistic
experiment being precisely equivalent to the first. So we might group the first $k$
events together into an event $A$, which would have a probability $p(A) = \sum_{i=1}^{k} p_i$, and
the remaining $n - k$ into an event $B$ of probability $p(B) = \sum_{i=k+1}^{n} p_i$, and then give
the conditional probabilities of the events $x_1, \ldots, x_k$ conditional on composite event $A$
occurring, $(p_1/p(A)), \ldots, (p_k/p(A))$, and similarly the conditional probabilities for the
events $x_{k+1}, \ldots, x_n$ conditional on event $B$. The grouping axiom then concerns how the
uncertainty measures should be related for these different descriptions of the same
probabilistic experiment. It says that our uncertainty about which event will occur should
be equal to our uncertainty about which group it will belong to plus the expected value
of the uncertainty that would remain if we were to know which group it belonged to
(this expected value being the weighted sum of the uncertainties of the conditional
distributions, with weights given by the probability of the outcome lying within a given
group).
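
That the Shannon measure relates the two descriptions in this way can be verified directly. The Python sketch below is illustrative only: `grouped_uncertainty` is a hypothetical helper, the first distribution is Shannon's three-outcome example, and the second distribution and its grouping are arbitrary choices.

```python
import math

def entropy(dist):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

def grouped_uncertainty(probs, groups):
    """Uncertainty about which group occurs, plus the expected uncertainty
    remaining within a group once the group is known."""
    group_probs = [sum(probs[i] for i in group) for group in groups]
    residual = sum(
        pg * entropy([probs[i] / pg for i in group])
        for group, pg in zip(groups, group_probs)
        if pg > 0
    )
    return entropy(group_probs) + residual

# Shannon's example: outcome 0 alone, outcomes 1 and 2 grouped together.
shannon_example = [1 / 2, 1 / 3, 1 / 6]
direct_3 = entropy(shannon_example)
grouped_3 = grouped_uncertainty(shannon_example, [[0], [1, 2]])

# An arbitrary four-outcome distribution split into events A and B.
four_outcomes = [0.1, 0.2, 0.3, 0.4]
direct_4 = entropy(four_outcomes)
grouped_4 = grouped_uncertainty(four_outcomes, [[0, 1], [2, 3]])

print(direct_3, grouped_3)
print(direct_4, grouped_4)
```

The direct and grouped values agree up to floating-point rounding, since the two calls merely describe the same probabilistic experiment in different ways.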
So in particular, let us imagine an experiment with $n + 1$ outcomes which we
label $a_1, a_2, \ldots, a_{n-1}, b_1, b_2$, having probabilities $p_1, \ldots, p_{n-1}, q_1, q_2$ respectively. We can
define an event $a_n = b_1 \cup b_2$, $b_1 \cap b_2 = \emptyset$, which would have probability $p_n = q_1 + q_2$
and the probabilities for $b_1$ and $b_2$ conditional on $a_n$ occurring will then be $\frac{q_1}{p_n}, \frac{q_2}{p_n}$
respectively. Grouping Axiom 1 says that the uncertainty in the occurrence of events
$a_1, a_2, \ldots, a_{n-1}, b_1, b_2$ is equal to the uncertainty for the occurrence of events $a_1, \ldots, a_n$
plus the uncertainty for the occurrence of $b_1, b_2$ conditional on $a_n$ occurring, weighted
by the probability that $a_n$ should occur.

Brukner and Zeilinger suggest that the grouping axiom, however, embodies certain 
classical presumptions that do not apply in quantum mechanics. This entails that the 
axiomatic derivation of the form of the Shannon measure does not go through and that 
the Shannon information ceases to be a measure of uncertainty in the quantum context. 
The argument turns on their interpretation of the grouping axiom, which differs from 
the standard interpretation in that it refers to joint experiments. 

Brukner and Zeilinger's interpretation 

If we take an experiment, $A$, with outcomes $a_1, \ldots, a_n$ and probabilities
$(p(a_1), \ldots, p(a_n)) = (p_1, \ldots, p_n)$ and an experiment, $B$, with outcomes $b_1, b_2$, then for
the joint experiment $A \wedge B$, the event $a_n$ is the union of the two disjoint events $a_n \wedge b_1$
and $a_n \wedge b_2$. Let us assign to these two events the probabilities $q_1$ and $q_2$ respectively.
Then $p(a_n) = p(a_n \wedge b_1) + p(a_n \wedge b_2) = q_1 + q_2 = p_n$. On this interpretation, the left
hand side of Grouping Axiom 1 is to be understood as denoting the uncertainty in the
experiment with outcomes $a_1, a_2, \ldots, a_{n-1}, a_n \wedge b_1, a_n \wedge b_2$.

If $a_n$ occurs, the conditional probabilities for $b_1, b_2$ will be $p(a_n \wedge b_1)/p(a_n) = q_1/p_n$,
$p(a_n \wedge b_2)/p(a_n) = q_2/p_n$ respectively, and so $H(q_1/p_n, q_2/p_n)$ is the uncertainty in
the value of $B$ given that $a_n$ occurs.

The grouping axiom can now be rewritten as:

Grouping Axiom 2 (Brukner and Zeilinger)

$$\begin{aligned}
&H\big(p(a_1), p(a_2), \ldots, p(a_{n-1}), p(a_n \wedge b_1), p(a_n \wedge b_2)\big) \\
&\quad = H\big(p(a_1), p(a_2), \ldots, p(a_n)\big) + p(a_n)\, H\big(p(b_1|a_n), p(b_2|a_n)\big).
\end{aligned} \tag{2.2}$$

Generalizing to the case in which we have $m$ outcomes for experiment $B$ and distinguish
$B$ values for all $n$ $A$ outcomes, so that we have $mn$ outcomes $a_i \wedge b_j$, the grouping axiom
becomes:

Grouping Axiom 3 (Brukner and Zeilinger)

$$H(A \wedge B) = H(A) + H(B|A)$$

From the point of view of Shannon's original presentation, this expression appears as 
a theorem rather than an axiom, being a consequence of the logarithmic form of the 
Shannon information and the definition of the conditional entropy. 
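
That this relation follows from the logarithmic form of $H$ together with the definition of the conditional entropy can be illustrated with a small computation. This is a sketch under our own assumptions: the joint distribution below is hypothetical and is simply stipulated to exist, which is exactly what cannot be done in general for non-commuting observables.

```python
import math

def shannon(probs):
    """Shannon information H (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def chain_rule_sides(joint):
    """joint[i][j] = p(a_i and b_j). Returns (H(A and B), H(A) + H(B|A))."""
    p_a = [sum(row) for row in joint]                          # marginal distribution for A
    h_joint = shannon([p for row in joint for p in row])       # entropy of the joint distribution
    # H(B|A): entropy of each conditional distribution p(b_j | a_i), weighted by p(a_i)
    h_b_given_a = sum(pa * shannon([p / pa for p in row])
                      for pa, row in zip(p_a, joint) if pa > 0)
    return h_joint, shannon(p_a) + h_b_given_a

# Hypothetical joint distribution with n = 2 outcomes for A and m = 3 for B.
joint = [[0.10, 0.20, 0.05],
         [0.30, 0.05, 0.30]]
h_joint, h_sum = chain_rule_sides(joint)
print(h_joint, h_sum)
```

The two values coincide for any well defined joint distribution, which is why the expression has the status of a theorem once the form of $H$ is given.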

The inapplicability argument 

The classical assumptions made explicit, Brukner and Zeilinger suggest, in Grouping
Axioms 2 and 3 are that attributes corresponding to all possible measurements can be
assigned to a system simultaneously (in this case, $a_i$, $b_j$ and $a_i \wedge b_j$) and that
measurements can be made ideally non-disturbing. Grouping Axiom 3, for example, is supposed
to express the fact that classically, the information we expect to gain from a joint
experiment $A \wedge B$ is the same as the information we expect to gain from first performing
$A$, then performing $B$ (where the uncertainty in $B$ is updated conditional on the $A$
outcome, but our ability to predict $B$ outcomes is not degraded by the $A$ measurement).

Their inapplicability argument is simply that as the grouping axiom requires us
to consider joint experiments, the uniqueness proof for the Shannon information will
fail in the quantum context, because we can consider measurements of non-commuting
observables and the joint probabilities on the left hand side of Grouping Axiom 2 will not
be defined for such observables; thus the grouping axiom will fail to hold. Furthermore,
the grouping axiom shows that the Shannon information embodies classical assumptions,
so the Shannon measure will not be justified as a measure of uncertainty because these
assumptions do not hold in the quantum case. The result is that

...only for the special case of commuting, i.e., simultaneously definite
observables, is the Shannon measure of information applicable and the use of the
Shannon information justified to define the uncertainty given before quantum
measurements are performed. (Brukner and Zeilinger 2001, p. 4), my
emphasis.

This argument is problematic, however. Let us begin with the obvious point that
a failure of the argument for uniqueness does not automatically rule out the Shannon
information as a measure of uncertainty. In fact, the Shannon information can be seen as
one of a general class of measures of uncertainty, characterised by a set of axioms in which
the grouping axiom does not appear (Uffink, 1990); hence the grouping axiom is not
necessary for the interpretation of the Shannon information as a measure of uncertainty.
(Uffink in fact has previously argued that the grouping axiom is not a natural constraint
on a measure of information and should not be imposed as a necessary constraint, even in
the classical case (Uffink, 1990, §1.6.3).) So from the fact that on the Brukner/Zeilinger
reading, the grouping axiom seems to embody some classical assumptions that do not
hold in the quantum case, it does not follow that the concept of the Shannon information
as a measure of uncertainty involves those classical assumptions.

Furthermore, Brukner and Zeilinger's grouping axiom is not in fact equivalent to 
the standard form and the standard form is equally applicable in both the classical 
and quantum cases. Thus the Shannon information has not been shown to involve 
classical assumptions and the standard axiomatic derivation can indeed go through in 
the quantum context. The probabilities appearing in Grouping Axiom 1 are well defined
in both the classical and quantum cases. 

Figure 2.2: a) in Grouping Axiom 1 and eqn. (2.3) the '$B$' outcomes $b_1$ and $b_2$ cannot
occur without $a_n = b_1 \cup b_2$ occurring; b) in the joint experiment scenario, $b_1$ or $b_2$ can
occur without $a_n$ occurring, but this is made to appear the same by coarse graining the
joint experiment and only recording $B$ values when we get $A$ outcome $a_n$.

In Brukner and Zeilinger's notation, Grouping Axiom 1 would be written as

$$\begin{aligned}
H\big(p(a_1), p(a_2), \ldots, p(a_{n-1}), p(b_1), p(b_2)\big)
&= H\big(p(a_1), \ldots, p(a_{n-1}), p(b_1 \vee b_2)\big) \\
&\quad + p(b_1 \vee b_2)\, H\big(p(b_1|b_1 \vee b_2), p(b_2|b_1 \vee b_2)\big).
\end{aligned} \tag{2.3}$$



This refers to an experiment with $n+1$ outcomes labelled by $a_1, \ldots, a_{n-1}, b_1, b_2$ and
the grouping of two of these outcomes together, and is clearly different from Grouping
Axiom 2 (see Fig. 2.2). Brukner and Zeilinger's Grouping Axiom 2 is the result of
applying (2.3) to a coarse grained joint experiment in which we only distinguish the $B$
outcomes of $A \wedge B$ for $A$ outcome $a_n$, and Grouping Axiom 3 is the result of applying
the simple grouping axiom to the fine grained joint experiment $n$ times. These are
not, then, expressions of the grouping axiom, but rather demonstrate its effect when
actually applied to an already well defined joint probability distribution. The absence
of certain joint probability distributions in quantum mechanics does not, however, affect
the meaningfulness of the grouping axiom, because in its proper formulation it does not
refer to joint experiments^. (Note that if the outcomes of the experiment in eqn. (2.3)
were represented by one dimensional projection operators, the event $b_1 \vee b_2$ would be
represented by a sum of orthogonal projectors which commutes with the remaining
projectors; a similar relation (coexistence) holds if the outcomes are represented by
POV elements (effects) (Busch et al.).)

^To see that the standard case and the joint experiment case are mathematically distinct, note that
the joint experiment formalism cannot express the situation in which $B$ events only happen if the $a_n$
event occurs. For that we would require that $p(a_n) = p(b_1) + p(b_2)$, but then the marginal distribution
for $B$ outcomes in the joint experiment does not sum to unity as is required for a well defined joint
experiment, $\sum_i p(b_i) = p(a_n) \neq 1$ in general.

Thus we see that the explicit argument fails. There may remain, however, a certain
intuitive one. Brukner and Zeilinger are perhaps suggesting that we miss something
importantly quantum by using the Shannon information as its derivation is restricted
to the case of commuting observables. Or rather (since we have seen that the grouping
axiom does not explicitly refer to joint experiments and we know that it is not in fact
necessary for the functioning of the Shannon information as a measure of uncertainty 
anyway), because it can only tell us about the uncertainty in joint experiments for 
mutually commuting observables rather than the full set of observables. However, when 
we recall that a measure of uncertainty is a measure of the spread of a probability 
distribution, we see that this simply amounts to the truism that one cannot have a 
measure of the spread of a joint probability distribution unless one actually has a joint 
probability distribution, and it in no way implies that there is anything un-quantum 
about the Shannon information itself. 

The fact that joint probability distributions cannot be defined for all possible group- 
ings of experiments that we might consider does not tell us anything about whether a 
certain quantity is a good or bad measure of uncertainty for probability distributions 
that can be defined (e.g. any probability distribution derived from a quantum state 
by the trace rule, or conditional probabilities given by the Lüders rule). We must be
careful not to confuse the question of what makes a good measure of uncertainty with 
the question of when a joint probability distribution can be defined. 

We already know that a function of a joint probability distribution cannot be a 
way of telling us how much we know, or how uncertain we are, in general when we
know the state of a system, because we know that a joint probability distribution for 
all possible measurements does not exist. It is for this reason in quantum mechanics 
that we introduce measures of mixedness such as the von Neumann entropy, which are 
functions of the state rather than of a probability distribution. It is not a failing of the 
Shannon information as a measure of uncertainty or expected information gain that it 
does not fulfil the same role. Part of Brukner and Zeilinger's worry about the Shannon 
information thus seems to arise because they are trying to treat it too much like a 
measure of mixedness, a measure of how uncertain we are in general when we know the 
state of a quantum system^. 

^This is illustrated for example in their reply to criticism of their grouping axiom argument by
Hall (Brukner and Zeilinger, 2000b; Hall, 2000). Hall presents an interpretation of the grouping axiom
concerning the increase in randomness on mixing of non-overlapping distributions, to which Brukner
and Zeilinger's worries about joint experiments would not apply. Their reply, in essence, is that the
density matrix cannot be simultaneously diagonal in non-commuting bases, therefore it cannot be
thought to be composed of non-overlapping classical distributions, hence Hall's grouping axiom will not
apply, further supporting their original claim that the Shannon measure is tied to the notion of classical
properties. What this reply in fact establishes, however, is that Hall's axiom applied to mixtures of
classical distributions is not relevant to characterising the randomness of the density matrix; but this
is something with which everyone would agree, and this job is certainly not one for which the Shannon
information is intended. (If we did wish to use the grouping axiom in characterising the randomness
of the density matrix, we would apply Hall's version to mixtures of density operators with orthogonal
support; this would then pick out the von Neumann entropy up to a constant factor (Wehrl, 1978).)

2.3 Brukner and Zeilinger's 'Total information content'

The final argument proposed against the Shannon information is that it is not appropriately
related to a notion of 'total information content' for a quantum system. It is also
suggested that the von Neumann entropy, which has a natural relation to the Shannon
information, is not a measure of information content as it makes no explicit reference to
information gain from measurements in general (Brukner and Zeilinger, 2001, 2000b).

In place of the Shannon information, Brukner and Zeilinger propose the quantity

$$I(\vec{p}) = \sum_{i=1}^{n} \left(p_i - \frac{1}{n}\right)^2, \tag{2.4}$$

from which they derive their notion of total information content as follows:

A set of measurements is called mutually unbiased or complementary (Schwinger,
1960) if the sets of projectors $\{P\}$, $\{Q\}$ associated with any pair of measurement bases
satisfy $\mathrm{Tr}(PQ) = 1/n$, where $n$ is the dimensionality of the system. There can exist
at most $n+1$ such bases (Wootters and Fields, 1989), constituting a complete set, and
as was shown by Ivanovic (1981), measurement of such a complete set on an ensemble
of similarly prepared systems determines their density matrix $\rho$ completely. In analogy
to acquiring the information associated with a (pointlike) classical system by learning
its state (determining its position in phase space), Brukner and Zeilinger then suggest
that the total information content of a quantum system should be given by a sum of
information measures for a complete set of mutually unbiased measurements. Taking
$I(\vec{p})$ as the measure of information, the result is a unitarily invariant quantity:

$$I_{tot} = \sum_{j=1}^{n+1} I(\vec{p}^{\,j}) = \sum_{ji} \left(p_i^j - \frac{1}{n}\right)^2 = \mathrm{Tr}\left(\rho - \frac{\mathbf{1}}{n}\right)^2. \tag{2.5}$$

This also provides their argument against the Shannon information. It is a necessary 
constraint on a measure of total information content, they argue, that it be unitarily 
invariant, but substituting $H(\vec{p})$ for $I(\vec{p})$ in (2.5) does not result in a unitarily invariant
quantity, that is, we do not have a sum to a 'total information content'. Let us call the 
requirement that a measure sum to a unitarily invariant quantity for a complete set of 
mutually unbiased measurements the 'total information constraint'. The suggestion is 
that the Shannon measure is inadequate as a measure of information gain because it does 
not satisfy the total information constraint and hence does not tell us how much of the 
total information content of a system we learn by performing measurements in a given 
basis. Similarly, the complaint against the von Neumann entropy is that it is merely a 
measure of mixedness, as unlike Itot, it has no relation to the information gained in a 
measurement unless we happen to measure in the eigenbasis of p. 
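
The contrast at issue can be made concrete for a single qubit, where the three Pauli eigenbases form a complete set of mutually unbiased bases. The sketch below is ours, not Brukner and Zeilinger's: it assumes the standard Bloch-vector parametrisation, in which a measurement of $\sigma_k$ yields outcome probabilities $(1 \pm r_k)/2$ and a unitary transformation of the state corresponds to a rotation of the Bloch vector. All function names are hypothetical.

```python
import math

def shannon(probs):
    """Shannon information H (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def bz_measure(probs):
    """I(p) = sum_i (p_i - 1/n)^2, taken here without any normalisation constant."""
    n = len(probs)
    return sum((p - 1.0 / n) ** 2 for p in probs)

def mub_distributions(bloch):
    """Outcome distributions for sigma_x, sigma_y, sigma_z given a Bloch vector."""
    return [[(1 + r) / 2, (1 - r) / 2] for r in bloch]

def totals(bloch):
    """Sum of I(p) and sum of H(p) over the three mutually unbiased bases."""
    dists = mub_distributions(bloch)
    return sum(bz_measure(d) for d in dists), sum(shannon(d) for d in dists)

s = 1 / math.sqrt(3)
i_a, h_a = totals((0.0, 0.0, 1.0))   # a pure state with Bloch vector along z
i_b, h_b = totals((s, s, s))         # the same state after a rotation (a unitary)
print(i_a, i_b)                      # the sums of I(p) coincide
print(h_a, h_b)                      # the sums of H(p) differ
```

The summed $I(\vec{p})$ is the same for both Bloch vectors, whereas the summed Shannon information is not; this is the sense in which $H(\vec{p})$ fails the total information constraint.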

A few remarks are in order. First, $I(\vec{p})$ and $I_{tot}$ are not unfamiliar expressions. The
quantity $\sum_i (p_i - 1/n)^2$ is one of the class of measures of the concentration of a
probability distribution given by Uffink (1990), and Fano (1957), for example, remarks that
$\mathrm{Tr}(\rho^2)$ can serve as a good measure of information; furthermore, the relation expressed
in eqn. (2.5) has previously been employed by Larsen (1990) in discussing exact
uncertainty relations. Note also that $I(\vec{p})$ is an increasing function of the concentration of
a probability distribution, hence a measure of how much we know given a probability
distribution, rather than being a measure of uncertainty like $H(\vec{p})$; similarly $I_{tot}$ is an
increasing function of the purity of $\rho$.

More importantly, however, 'information content' might mean several different things. 
It may not, then, be reasonable to require that every meaningful information measure 
sum to a unitarily invariant quantity that can be interpreted as an information content. 
Moreover, we may well ask why information measures for a complete set of mutually 
unbiased measurements should be expected to sum to any particularly interesting quantity.
That the measure $I(\vec{p})$ happens to sum to a unitarily invariant quantity is, as we
shall presently see, the consequence of a geometric property tangential to its role as a 
measure of information. 

2.3.1 Some Different Notions of Information Content 

It is useful to distinguish between the information encoded in a system, the information 
required to specify the state of a system (more precisely, the information required to 
specify a sequence of states drawn from a given ensemble) and states of complete and 
less than complete knowledge or information. Each of these can serve as a notion of 
information content in an appropriate context. In the classical case, their differences 
can be largely ignored, but in the quantum case there are important divergences. As 
we saw in the last chapter, it is necessary, for instance, to introduce the concept of the 
accessible information to characterize the difference between information encoded and 
specification information. 

If we consider encoding the outputs of a classical information source $A$ into pure
states $|a_i\rangle$ of an ensemble of quantum systems, then the state of the ensemble will
be given by $\rho = \sum_i p(a_i)|a_i\rangle\langle a_i|$. The von Neumann entropy, $S(\rho) = -\mathrm{Tr}\,\rho\log\rho$, is a
measure of how mixed this state is, giving us one sense of information content: the more
mixed a state, the less information we have about what the outcome of measurements
on systems described by the state will be^.

If we are presented with a sequence of systems drawn from an ensemble prepared in
this manner, each will be in one of the pure states, and the number of bits per system
required to specify this sequence will be $H(A)$, the information of the classical source
(which will be greater than $S(\rho)$ unless we have coded in orthogonal states). As we
know, this is the specification information, also the amount of information required to
prepare the sequence. For the amount of information that has actually been encoded
into the systems, however, we need to consider measurements on the ensemble and the
Shannon mutual information $H(A:B)$.

^Mixed states are also sometimes said to be states of less than complete information due to a lack of
information about the way a system was prepared, represented by a probability distribution over possible
pure states. Our reading is to be preferred given the many-one relation of preparation procedures to
density operators and the fact that density operators can also result from tracing out unwanted degrees
of freedom.

As already remarked (Section 2.2.1), if we encode using a certain basis (our $|a_i\rangle$ form
an orthonormal set) and we measure in a different basis, then $H(A:B) < H(A)$;
quantum 'noise' reduces the amount we can decode. More significantly, if we have coded in
non-orthogonal states then no measurement can distinguish these states perfectly and
we cannot recover the complete classical information. The maximum amount of
information encoded in a system is given by the accessible information (the maximum over
all decoding observables of the mutual information) and using non-orthogonal coding
states, the amount we can encode is less than the specification information. As we have
seen, the Holevo bound (Holevo, 1973) provides an upper bound on the mutual
information resulting from measurement of any observable (including POV measurements).
For the case we are considering of pure encoding states, this reduces to

$$H(A:B) \leq S(\rho).$$

This provides a very strong sense in which the von Neumann entropy does give us 
a notion of the total information content of a quantum system — it is the maximum 
amount that can actually be encoded in the system. 
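
A simple qubit case illustrates the ordering between decodable information, $S(\rho)$ and $H(A)$. The sketch below rests on assumptions of our own choosing: two equiprobable, non-orthogonal pure states with real amplitudes separated by an angle of $\pi/4$, and a scan over projective measurements in real bases only, so the maximum found need not be the accessible information in general. The function names are hypothetical.

```python
import math

def shannon(probs):
    """Shannon information H (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def von_neumann_2x2(rho):
    """Entropy (bits) of a real symmetric 2x2 density matrix, via its eigenvalues."""
    trace = rho[0][0] + rho[1][1]
    det = rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]
    gap = math.sqrt(max(trace * trace - 4 * det, 0.0))
    return shannon([(trace + gap) / 2, (trace - gap) / 2])

def mutual_information(priors, states, phi):
    """H(A:B) for a projective measurement in the real basis rotated by angle phi."""
    basis = [(math.cos(phi), math.sin(phi)), (-math.sin(phi), math.cos(phi))]
    # cond[i][j] = p(b_j | a_i) = squared overlap of basis vector j with state i
    cond = [[(b[0] * s[0] + b[1] * s[1]) ** 2 for b in basis] for s in states]
    p_b = [sum(pa * row[j] for pa, row in zip(priors, cond)) for j in range(2)]
    return shannon(p_b) - sum(pa * shannon(row) for pa, row in zip(priors, cond))

theta = math.pi / 4                                  # angle between the two coding states
priors = [0.5, 0.5]
states = [(1.0, 0.0), (math.cos(theta), math.sin(theta))]
rho = [[sum(pa * s[r] * s[c] for pa, s in zip(priors, states)) for c in range(2)]
       for r in range(2)]

h_source = shannon(priors)                           # specification information H(A)
s_rho = von_neumann_2x2(rho)                         # von Neumann entropy S(rho)
best = max(mutual_information(priors, states, k * math.pi / 360) for k in range(360))
print(best, s_rho, h_source)
```

For these choices the largest mutual information found lies below $S(\rho)$, which in turn lies below $H(A) = 1$ bit, as the discussion above leads us to expect.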

Brukner and Zeilinger do not consider a quantum communication channel but are 
concerned rather with the information content of a single system considered in isolation. 
This information content is supposed to relate to how much we learn from learning the 
state, but if the system is being treated in isolation then by learning its state we are 
not acquiring a certain amount of information in virtue of the state being drawn from 
a given ensemble, as in the standard notion of information. (Hence their analogy with 
gaining the information content of a classical system fails to hold.) In fact, their 'total 
information content' seems best interpreted as a measure of mixedness analogous to the 
von Neumann entropy.



When introduced (Brukner and Zeilinger, 1999b), the information measure $I(\vec{p})$ is
presented as a measure of how much we know about what the outcome of a particular
experiment will be, given the state. The total information of the state, then, would
seem to be a measure of how much we know in general about what the outcomes of
experiments will be given the state; and this is precisely a question of the degree of
mixedness of the state*.



Measures of mixedness 



The functioning of measures of mixedness can usefully be approached via the notions
of majorization and Schur convexity (concavity). The majorization relation $\prec$ imposes
a pre-order on probability distributions (Uffink, 1990; Nielsen, 2001). A probability
distribution $\vec{q}$ is majorized by $\vec{p}$, $\vec{q} \prec \vec{p}$, iff $q_i = \sum_j S_{ij} p_j$, where $S_{ij}$ is a doubly stochastic
matrix. That is (via Birkhoff's theorem), if $\vec{q}$ is a mixture of permutations of $\vec{p}$. Thus if
$\vec{q} \prec \vec{p}$, then $\vec{q}$ is a more mixed or disordered distribution than $\vec{p}$.

Schur convex (concave) functions respect the ordering of the majorization relation:
a function $f$ is Schur convex if $\vec{q} \prec \vec{p}$ implies $f(\vec{q}) \leq f(\vec{p})$, and Schur concave if
$\vec{q} \prec \vec{p}$ implies $f(\vec{q}) \geq f(\vec{p})$ (for strictly Schur convex (concave) functions, equality holds only if
$\vec{q}$ and $\vec{p}$ are permutations of one another). This explains the utility of such functions as
measures of the concentration and uncertainty of probability distributions, respectively.
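
These definitions can be tried out on a small example. The sketch below is illustrative only: it tests majorization through the partial sums of the decreasingly ordered distributions, which is a standard equivalent of the doubly stochastic characterisation used in the text rather than that characterisation itself, and the matrix $S$ and distribution $\vec{p}$ are hypothetical.

```python
import math

def shannon(probs):
    """Shannon information H (in bits): a Schur concave function."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def bz_measure(probs):
    """I(p) = sum_i (p_i - 1/n)^2: a Schur convex function."""
    n = len(probs)
    return sum((p - 1.0 / n) ** 2 for p in probs)

def is_majorized_by(q, p, tol=1e-12):
    """True if q is majorized by p, tested via ordered partial sums."""
    run_q = run_p = 0.0
    for a, b in zip(sorted(q, reverse=True), sorted(p, reverse=True)):
        run_q += a
        run_p += b
        if run_q > run_p + tol:       # a partial sum of q exceeds that of p
            return False
    return abs(run_q - run_p) <= tol  # both must have the same total

def apply_stochastic(S, p):
    """q_i = sum_j S_ij p_j."""
    return [sum(S[i][j] * p[j] for j in range(len(p))) for i in range(len(S))]

p = [0.7, 0.2, 0.1]
S = [[0.6, 0.3, 0.1],                 # rows and columns each sum to one
     [0.3, 0.4, 0.3],
     [0.1, 0.3, 0.6]]
q = apply_stochastic(S, p)
print(q, is_majorized_by(q, p))
print(shannon(q) >= shannon(p), bz_measure(q) <= bz_measure(p))
```

The mixed distribution $\vec{q}$ is majorized by $\vec{p}$, and the two functions order the pair in opposite directions, as the Schur concavity of $H$ and the Schur convexity of $I$ require.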

The majorization relation will apply equally to the vectors of eigenvalues of density
matrices. It can be shown that the vector of eigenvalues $\vec{\lambda}'$ of the density matrix $\rho'$ of the
post measurement ensemble for a (non-selective) projective measurement is majorized
by the vector of eigenvalues $\vec{\lambda}$ of the pre-measurement state $\rho$ (Nielsen, 2001). (If we
measure in the eigenbasis of $\rho$, then there is, of course, no change in the eigenvalues). The
$\lambda'_i$ are just the probabilities of the different outcomes of the measurement in question,
thus the probability distribution for the outcomes of any given measurement will be
more disordered or spread than the eigenvalues of $\rho$.

*Recently it has been noted that $I_{tot}$ is also related to the average distance of our estimate of
the unknown state from the true state (measured in the Hilbert-Schmidt norm), given only a finite
number $N$ of experiments in each mutually unbiased basis (Řeháček and Hradil, 2002). This seems
best understood as indicating that the mean error in our state estimation is inversely related to $N$, with
a constant of proportionality that depends on the dimension of the system and the mixedness of the
state. In any case, Brukner and Zeilinger are primarily interested in how much we know when the state
has been determined to arbitrary accuracy.

If we take any Schur concave function we know to be a measure of uncertainty, for
instance the Shannon information $H(\vec{p})$, and $\vec{p}$ is the probability distribution for
measurement outcomes, we then know that $H(\vec{p}) \geq H(\vec{\lambda})$, for any projective measurement
we might perform. This explains why $H(\vec{\lambda}) = S(\rho)$, a measure of mixedness, is a
measure of how much we know given the state: the more mixed a state, the more uncertain
we must be about the outcome of any given measurement. Similarly, if we take a
measure of the concentration of a probability distribution, a Schur convex function such as
$I(\vec{p})$, then we know that for any measurement with outcome probability distribution $\vec{p}$,
$I(\vec{p}) \leq I(\vec{\lambda}) = I_{tot}$, and this explains why $I_{tot}$ is a measure of how much we know given
$\rho$: the less the value of $I_{tot}$, the less able we are to predict the outcome of any given
experiment. Note, however, that it is the structure that is imposed by the majorization
relation that is of underlying importance. Choosing a particular measure $H(\vec{p})$, $I(\vec{p})$, or
any other member $U_r(\vec{p})$ of Uffink's general class of measures of uncertainty, is simply
a matter of choosing by convention a numerical measure to lay on top of this structure,
for convenience.
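
Both bounds can be seen at work for a mixed qubit state. The following is a sketch under assumptions of our own: the state is specified by a hypothetical Bloch vector of length $0.5$, its eigenvalues are $(1 \pm |\vec{r}|)/2$, and a projective measurement along a unit direction $\vec{m}$ has outcome probabilities $(1 \pm \vec{r}\cdot\vec{m})/2$.

```python
import math

def shannon(probs):
    """Shannon information H (in bits) of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def bz_measure(probs):
    """I(p) = sum_i (p_i - 1/n)^2, without normalisation constant."""
    n = len(probs)
    return sum((p - 1.0 / n) ** 2 for p in probs)

def outcome_distribution(bloch, direction):
    """Outcome probabilities for a projective measurement along a unit direction."""
    dot = sum(r * m for r, m in zip(bloch, direction))
    return [(1 + dot) / 2, (1 - dot) / 2]

bloch = (0.3, 0.0, 0.4)                               # hypothetical mixed state, |r| = 0.5
length = math.sqrt(sum(r * r for r in bloch))
eigenvalues = [(1 + length) / 2, (1 - length) / 2]

# Three Pauli directions plus the eigenbasis direction r/|r| = (0.6, 0, 0.8).
directions = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (0.6, 0.0, 0.8)]
results = [(shannon(outcome_distribution(bloch, m)),
            bz_measure(outcome_distribution(bloch, m))) for m in directions]

print(shannon(eigenvalues), bz_measure(eigenvalues))  # S(rho) and I(lambda)
for h, i in results:
    print(h, i)                                       # H(p) never below S(rho); I(p) never above I(lambda)
```

Only the measurement along the eigenbasis direction attains the bounds; every other direction gives a more spread distribution.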

Brukner and Zeilinger would of course deny that their total information content 
is merely a measure of mixedness. The argument that it is more than this rests on 
the satisfaction of the total information constraint, the relation between the measure 
of information I{p) and Itot for a complete set of mutually unbiased measurements as 
expressed in eqn. (2.5). We shall now see that this relation can be given a simple
geometric explanation using the Hilbert-Schmidt representation of density operators. 



2.3.2 The Relation between Total Information Content and $I(\vec{p})$



The set of complex $n \times n$ Hermitian matrices forms an $n^2$-dimensional real Hilbert space
$V_h(\mathbb{C}^n)$ on which we have defined an inner product $(A, B) = \mathrm{Tr}(AB)$; $A, B \in V_h(\mathbb{C}^n)$
and a norm $\|A\| = \sqrt{\mathrm{Tr}(A^2)}$ (Fano, 1957; Wichmann, 1963). The density matrix $\rho$ of
an $n$ dimensional quantum system can be represented as a vector in this space. The
requirements on $\rho$ of unit trace and positivity imply that the tip of any such vector
must lie in the $n^2 - 1$ dimensional hyperplane $T$ a distance $1/\sqrt{n}$ from the origin and
perpendicular to the unit operator $\mathbf{1}$, and on or within a hypersphere of radius one
centred on the origin.

It is useful to introduce a set of basis operators on our space; we require linearly
independent operators $\Gamma_i \in V_h(\mathbb{C}^n)$ and it may be useful to require orthogonality and a
fixed norm: $\mathrm{Tr}(\Gamma_i\Gamma_j) = \mathrm{const.} \times \delta_{ij}$. Any operator on the system can then be expanded
in terms of this basis and in particular, $\rho$ can be written as a vector

$$\vec{\rho} = \frac{\mathbf{1}}{n} + \sum_{i=1}^{n^2-1} \mathrm{Tr}(\rho\Gamma_i)\Gamma_i,$$

where we have chosen $\Gamma_0 = \mathbf{1}$ to take care of the trace condition and taken the
normalisation constant of the remaining $\Gamma_i$ to be unity. A very familiar example
of this formalism is the Bloch sphere for two-dimensional quantum systems. Here the
basis operators $\Gamma_i$ are given by the three Pauli matrices $\sigma_i$ (up to normalisation).

Evidently, $\rho$ may be determined experimentally by finding the expectation values of
the $n^2 - 1$ operators $\Gamma_i$ in the state $\rho$. If we include the operator $\mathbf{1}$ in our basis set,
then the one-dimensional projectors associated with measurement of any maximal
(non-degenerate) observable will provide a maximum of a further $n - 1$ linearly independent
operators. Obtaining the probability distribution for a given maximal observable will
thus provide $n - 1$ of the parameters required to determine the state, and the minimum
number of measurements of maximal observables that will be needed in total is $n + 1$, if
each observable provides a full complement of linearly independent projectors.

Each such set of projectors spans an $n - 1$ dimensional hyperplane in $V_h(\mathbb{C}^n)$ and their
expectation values specify the projection of the state $\rho$ into this hyperplane. Ivanovic
(1981) noted that projectors $P$, $Q$ belonging to any two different mutually unbiased
bases will be orthogonal in $T$, hence the hyperplanes associated with measurement of
mutually unbiased observables are orthogonal in the space in which density operators are
constrained to lie in virtue of the trace condition. If $n + 1$ mutually unbiased observables
can be found, then, $V_h(\mathbb{C}^n)$ can be decomposed into orthogonal subspaces given by the
one dimensional subspace spanned by $\mathbf{1}$ and the $n + 1$ subspaces associated with the
mutually unbiased observables. The state $\rho$ can then be expressed as:

$$\vec{\rho} = \frac{\mathbf{1}}{n} + \sum_{j=1}^{n+1} \sum_{i=1}^{n} q_i^j \bar{P}_i^j, \tag{2.6}$$

where $\bar{P}_i^j = P_i^j - \mathbf{1}/n$ is the projection onto $T$ of the $i$th one-dimensional projector in
the $j$th mutually unbiased basis set, and $q_i^j = (p_i^j - 1/n)$ is the expectation value of this
operator in the state $\rho$. For a given value of $j$, the vectors $\bar{P}_i^j$ span an $(n - 1)$ dimensional
orthogonal subspace and the square of the length of a vector expressed in the form (2.6)
lying in subspace $j$ will be given by $\sum_{i=1}^{n} (q_i^j)^2 = I(\vec{p}^{\,j})$.

The geometrical explanation of $I_{tot}$ is then simply as follows. $\mathrm{Tr}(\rho^2)$ is the square of
the length of $\vec{\rho}$ in $V_h(\mathbb{C}^n)$ and $\mathrm{Tr}(\rho - \mathbf{1}/n)^2$ is the square of the distance of $\rho$ from the
maximally mixed state (the length squared of $\vec{\rho}$ in $T$). This squared length will just be
the sum of the squares of the lengths of the components of the vector $\rho - \mathbf{1}/n$ in the
orthogonal subspaces into which we have decomposed $V_h(\mathbb{C}^n)$, i.e. it will be given by
$\sum_{j}\sum_{i} (q_i^j)^2$. This is what eqn. (2.5) reports and it explains how $I_{tot}$ and $I(\vec{p})$ satisfy the
total information constraint.
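
The qubit case makes this geometry easy to inspect. The sketch below is ours and is offered only as an illustration: it uses the Pauli matrices as basis operators, for which $\mathrm{Tr}(\sigma_i\sigma_j) = 2\delta_{ij}$, so a factor of $1/2$ appears in the expansion; the density matrix is a hypothetical example and all function names are our own.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

IDENTITY = [[1, 0], [0, 1]]
PAULIS = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]

def bloch_components(rho):
    """Expectation values Tr(rho sigma_i): the coordinates of rho within T."""
    return [complex(trace(matmul(rho, s))).real for s in PAULIS]

def rebuild(components):
    """rho = 1/2 + (1/2) sum_i Tr(rho sigma_i) sigma_i (the 1/2 reflects Tr(sigma_i^2) = 2)."""
    return [[IDENTITY[r][c] / 2 + sum(x * s[r][c] for x, s in zip(components, PAULIS)) / 2
             for c in range(2)] for r in range(2)]

def squared_distance_from_mixed(rho):
    """Tr(rho - 1/2)^2: squared distance of rho from the maximally mixed state."""
    delta = [[rho[r][c] - IDENTITY[r][c] / 2 for c in range(2)] for r in range(2)]
    return complex(trace(matmul(delta, delta))).real

rho = [[0.7, 0.1 - 0.2j], [0.1 + 0.2j, 0.3]]   # hypothetical density matrix
vec = bloch_components(rho)
rebuilt = rebuild(vec)

# For the basis of sigma_k the outcome probabilities are (1 +/- x)/2, so I(p) = 2 * (x/2)^2.
i_tot = sum(2 * (x / 2) ** 2 for x in vec)
print(vec, i_tot, squared_distance_from_mixed(rho))
```

The sum of the $I(\vec{p})$ over the three mutually unbiased bases reproduces the squared Hilbert-Schmidt distance from the maximally mixed state, each basis contributing the squared length of the component lying in its own orthogonal subspace.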

Thus we see that if $I_{tot}$ differs from being a simple measure of mixedness, then that
is because it is a measure of length also; and this explains why it can be given by a sum
of quantities $I(\vec{p})$ for a complete set of mutually unbiased measurements. As measures
of how much we know given the state, however, $I_{tot}$ and $S(\rho)$ bear the same relation to
their appropriate measures of information, as we saw in the previous section. Equally,
as measures of information, $H(\vec{p})$ stands to $S(\rho)$ in the same relation as $I(\vec{p})$ to $I_{tot}$. $I_{tot}$
is the upper bound on the amount we can know about the outcome of a measurement as
measured by $I(\vec{p})$; $S(\rho)$ is the lower bound on our uncertainty about what the outcome
of a measurement will be, as measured by $H(\vec{p})$.

The complaint against the Shannon information was supposed to be that as $H(\vec{p})$ fails
to satisfy the total information constraint, it does not tell us the information gained from
a particular measurement; the complaint against the von Neumann entropy that as $S(\rho)$
is not given by a sum of measures for a complete set of mutually unbiased measurements,
it is not suitably related to the information gained from a measurement. However, we
can now see that insisting on the total information constraint in this way is tantamount
to insisting that only a function which measures the length of the component of $\rho$ lying
in a given hyperplane can be a measure of information, and correlatively, that the only
viable notion of total information content is a measure of the length of $\rho$ in $V_h(\mathbb{C}^n)$. But
$H(\vec{p})$ can be a perfectly good measure of information without having to be a measure of
the length of the projection of $\rho$ into the subspace associated with an observable; and
as we have just seen, $S(\rho)$ does have an explicit relation to the information gain from
measurement that justifies its interpretation as a total information content. A relation,
moreover, that $I_{tot}$ also possesses and which serves to justify its interpretation as a
measure of how much we know given the state.

Hence our conclusion must be that the total information constraint is not a reasonable 
requirement on measures of information. 

Of course, $H(\vec{p})$ does not tell us the information gain on measurement if we take,
as Brukner and Zeilinger seem to, 'the information encoded in a basis' simply to mean
the length squared of the component of the state lying in the measurement hyperplane;
but this is a non-standard usage. $H(\vec{p})$ certainly remains a measure of our expected
information gain from performing a particular measurement (how much the outcome 
will surprise us, on average, given that we have the probability distribution); and if we 
are interested in the amount of information encoded, in the usual sense of the word, 
that can be decoded using a particular measurement, i.e., if we have a string of systems 
into which information has actually been encoded, then we may always just consider the 
Shannon mutual information associated with that measurement. (The 'total informa- 
tion' associated with this quantity will then be given, via the Holevo bound, as the von 
Neumann entropy, for pure encoding states.) 

Perhaps the fundamental error in this final argument of Brukner and Zeilinger is 
their failure to appreciate that the choice of measure of information one should adopt is 
a largely conventional matter, depending on what one's aims are and accordingly, which 
measure is most convenient. As such, trying to use the total information constraint to 
rule certain measures out as incorrect is simply mistaken. The geometrical property 
that the measures I(p) and Itot possess is indeed a nice one that will be useful in certain 
contexts, for example, if one wishes to provide certain exact uncertainty relations instead 
of inequalities (Larsen, 1990). But this just serves to make the point that it should be 
horses for courses. As we have seen, the Shannon information and von Neumann entropy 
have multiple important and central uses as measures of information in the quantum 
context. 
Of the three arguments that Brukner and Zeilinger have presented against the Shannon 
information, we have seen that the first two fail outright. These arguments sought to 
establish that the notion of the Shannon information is undermined in the quantum 
context due to a reliance on classical concepts. With regard to the first we saw that, 
contrary to Brukner and Zeilinger, the existence of a pre-determined string is neither 
necessary nor sufficient for the interpretation of H(p) as a measure of information, hence 
the absence of such a string would not cause any problems for the Shannon information 
in quantum mechanics. 

The objective of their second argument was to highlight classical assumptions in the 
grouping axiom that would prevent the axiomatic derivation of the Shannon information 
going through in the quantum case. This argument turned out to be based on an 
erroneous reading of the grouping axiom that appeals to joint experiments. The grouping 
axiom is in fact perfectly well defined in the quantum case and the standard axiomatic 
derivation of the form of H(p) can indeed go through. The grouping axiom does not 
reveal any problematic classical assumptions implicit in the Shannon information. 

In their final argument, Brukner and Zeilinger suggest that defining the notion of 
the total information content of a quantum system in terms of the Shannon information 
would lead to a quantity with the unnatural property of unitary non-invariance. But this 
is not a compelling argument against the Shannon quantity as a measure of information. 
We have seen that it is not a necessary requirement on every meaningful measure of 
information that it sum to a unitarily invariant quantity for a complete set of mutually 
unbiased measurements; nor, conversely, is it necessary that every viable notion of total 
information content be given by such a sum of individual measures of information. 

Brukner and Zeilinger's arguments thus fail to establish that the Shannon informa- 
tion involves any particularly classical assumptions or that there is any difficulty in the 
application of the Shannon measure to measurements on quantum systems. The Shan- 
non information is perfectly well defined and appropriate as a measure of information 
in the quantum context as well as in the classical. 



Chapter 3 

Case Study: Teleportation 



3.1 Introduction 



The phenomenon of teleportation (Bennett et al., 1993) is perhaps the most striking 
example of entanglement assisted communication. It illustrates several distinctive features 
associated with quantum information protocols, most notably the fact that entangle- 
ment (a characteristically quantum property) serves as an important resource, and that 
unknown quantum states cannot be cloned. Our main concern in this chapter, though, 
will be to consider teleportation as an example of how conceptual puzzles can arise 
if one thinks of information in the wrong way. That is, if one neglects the fact that 
'information' is an abstract noun. For teleportation has certainly often been seen as 
a conceptually puzzling process. I will suggest that these puzzles generally arise as a 
consequence of a familiar philosophical error — in fact the one that Strawson warns of in 
the epigraph to this Part — that is, the error of assuming that every grammatical sub- 
stantive, in this instance the word 'information', is a referring term. Let us begin with 
a brief review of the teleportation protocol.^1 

^1 Helpful discussions of further conceptual aspects of teleportation, in particular concerning the 
relation of teleportation to nonlocality, may be found in Hardy (1999), Barrett (2001) and Clifton and 
Pope (2001). Mermin (2001a) also provides an interesting perspective. 



Figure 3.1: Teleportation. A pair of systems is first prepared in an entangled state and shared 
between Alice and Bob, who are widely spatially separated. Alice also possesses a system in an 
unknown state |χ⟩. Once Alice performs her Bell-basis measurement, two classical bits recording 
the outcome are sent to Bob, who may then perform the required conditional operation to 
obtain a system in the unknown state |χ⟩. (Continuous black lines represent qubits, dotted 
lines represent classical bits. Time runs along the horizontal axis.) 

3.2 The quantum teleportation protocol 

In the teleportation protocol we consider two parties, Alice and Bob, who are widely 
separated, but each of whom possesses one member of a pair of particles in a maximally 
entangled state. Alice is presented with a system in some unknown quantum state, and 
her aim is to transmit this state to Bob. In the standard example, Alice and Bob share 
one of the four Bell states and she is presented with a spin-1/2 system in the unknown 
state |χ⟩ = α|↑⟩ + β|↓⟩. 

By performing a suitable joint measurement on her half of the entangled pair and 
the system whose state she is trying to transmit (in this example, a measurement in the 
Bell basis), Alice can flip the state of Bob's half of the entangled pair into a state that 
differs from |χ⟩ by one of four unitary transformations, depending on what the outcome 
of her measurement was. If a record of the outcome of Alice's measurement is then sent 
to Bob, he may perform the required operation to obtain a system in the state Alice 
was trying to send (Fig. 3.1). 

The result of the protocol is that Bob has obtained a system in the state |χ⟩, with 
nothing that bears any relation to the identity of this state having traversed the space 
between him and Alice. Only two classical bits recording the outcome of Alice's 
measurement were sent between them; and the values of these bits are completely random, 
with no dependence on the parameters α and β. Meanwhile, no trace of the identity of 
the unknown state remains in Alice's region, as is required in accordance with the no- 
cloning theorem (the state of her original system will usually now be maximally mixed). 
The state has 'disappeared' from Alice's region and 'reappeared' in Bob's, hence the use 
of the term teleportation for this phenomenon. 
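
The protocol just described can be checked numerically. The following is only a minimal sketch under stated assumptions, not part of the original presentation: it assumes that the shared Bell state is (|00⟩ + |11⟩)/√2, that |0⟩ and |1⟩ stand in for spin-up and spin-down, and that the qubits are ordered as (Alice's unknown system, Alice's half of the pair, Bob's half). The names `teleport`, `bell` and `correction` are hypothetical labels introduced for the illustration.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)  # stands in for spin-up
ket1 = np.array([0, 1], dtype=complex)  # stands in for spin-down


def kron(*vectors):
    """Tensor product of several state vectors, leftmost factor first."""
    out = np.array([1], dtype=complex)
    for v in vectors:
        out = np.kron(out, v)
    return out


s = 1 / np.sqrt(2)
# Bell basis for Alice's two qubits, keyed by the two classical bits she sends.
bell = {
    (0, 0): s * (kron(ket0, ket0) + kron(ket1, ket1)),
    (0, 1): s * (kron(ket0, ket1) + kron(ket1, ket0)),
    (1, 0): s * (kron(ket0, ket0) - kron(ket1, ket1)),
    (1, 1): s * (kron(ket0, ket1) - kron(ket1, ket0)),
}

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
# Bob's conditional operation for each pair of classical bits.
correction = {(0, 0): I2, (0, 1): X, (1, 0): Z, (1, 1): Z @ X}


def teleport(alpha, beta):
    """Return {bits: (probability, fidelity)} for each Bell outcome.

    alpha and beta are assumed to be normalised amplitudes.
    """
    chi = alpha * ket0 + beta * ket1
    # Joint state, reshaped so rows index Alice's two qubits, columns Bob's qubit.
    psi = kron(chi, bell[(0, 0)]).reshape(4, 2)
    results = {}
    for bits, b in bell.items():
        bob = b.conj() @ psi                  # unnormalised state of Bob's qubit
        prob = np.vdot(bob, bob).real         # probability of this outcome
        bob = correction[bits] @ (bob / np.sqrt(prob))
        fidelity = abs(np.vdot(chi, bob)) ** 2
        results[bits] = (prob, fidelity)
    return results
```

Calling `teleport(0.6, 0.8j)` gives probability 1/4 for every pair of bits and unit fidelity after Bob's correction, whatever normalised amplitudes are supplied; this is the sense in which the bits carry no dependence on α and β.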

Of course, this quantum mechanical process differs from science fiction versions of 
teleportation in at least two ways. First, it is not matter that is transported, but simply 
the quantum state |χ⟩; and second, the protocol is not instantaneous, but must attend 
for its completion on the arrival of the classical bits sent from Alice to Bob. Whether or 
not the quantum protocol approximates to the science fiction ideal, however, it remains a 
very remarkable phenomenon from the information-theoretic point of view.^2 For consider 
what has been achieved. An unknown quantum state has been sent to Bob; and how 
else could this have been done? Only by Alice sending a quantum system in the state 
|χ⟩ to Bob,^3 for she cannot determine the state of the system and send a description 
of it instead. (Recall, it is impossible to determine an unknown state of an individual 
quantum system.) 

If, however, Alice did per impossibile somehow learn the state and send a description 
to Bob, then systems encoding that description would have to be sent between them. In 
this case something that does bear a relation to the identity of the state is transmitted 
from Alice to Bob, unlike in teleportation. Moreover, sending such a description would 
require a very great deal of classical information, as in order to specify a general state 
of a two dimensional quantum system, two continuous parameters need to be specified. 

The picture we are left with, then, is that in teleportation there has been a transmis- 
sion of something that is inaccessible at the classical level (often loosely described as a 
transmission of quantum information) ; in the transmission this information has been in 
some sense disembodied; and finally, the transmission has been very efficient — requiring, 

^2 Interestingly, it can be argued that quantum teleportation is perhaps not so far from the sci-fi ideal as 
one might initially think. Vaidman (1994) suggests that if all physical objects are made from elementary 
particles, then what is distinctive about them is their form (i.e. their particular state) rather than the 
matter from which they are made. Thus it seems one could argue that objects really are teleported in 
the protocol. 

^3 Or by her sending Bob a system in a state explicitly related to |χ⟩ (cf. Park, 1970). 




apart from prior shared entanglement, the transfer of only two classical bits. 






3.2.1 Some information-theoretic aspects of teleportation 

There are two information-theoretic aspects of the teleportation protocol it is helpful to 
go into in somewhat more detail. The first concerns our reason for saying that a very 
large amount of information is required to specify the state that is teleported. 

As we just noted, in order to describe an arbitrary (pure) state of a two dimensional 
quantum system, it is necessary to specify two continuous parameters. Recalling the 
Bloch sphere representation (cf. Section 2.3.2), we may specify two real numbers (angles) 
to determine a point on the sphere. Why should doing this have associated with it an 
amount of information? If it is to do so we will need to imagine a classical information 
source that is selecting these pairs of angles with various probabilities; then a certain 
Shannon information may be ascribed to the process. Given a particular output of this 
information source, a quantum system is prepared in the state corresponding to the two 
angles selected. The quantum states prepared in this manner will then have associated 
with them a specification information (cf. Section given by the information of the 
source. Once a system has been prepared in some state in this way, it is presented to 
Alice, who may proceed to teleport the state to Bob. 

Rather than the pairs of angles being selected from their full, continuous, range 
of possible values, the surface of the sphere might be coarse-grained evenly to give a 
finite number of choices. One might pick the angles specifying the mid-point, say, of 
each small element of surface area to provide the finite set of pairs of angles to choose 
between. Loosely speaking this coarse-graining corresponds to considering angles only to 
a certain degree of accuracy. As this accuracy is increased (the choices made more finely 
grained), the number of bits required to specify the choice increases without bound. 
If our information source is selecting states to an arbitrarily high accuracy then, the 
specification information is unboundedly large. (On the other hand, if the information 
source is only selecting between a small number of distinct states, then the specification 
information is correspondingly small. From now on we will assume that unless otherwise 
stated, the unknown states to be teleported are selected from a suitable coarse-graining 
of the whole range of possible angles.) It is essential to note, however, that even if the 
specification information associated with the state that has been teleported to Bob is 
exceedingly large, the majority of this information is not accessible to him. This leads 
on to the second point. 
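
The growth of the specification information under finer coarse-graining can be illustrated with a rough calculation. This is a sketch under assumptions not made in the text: the source is taken to choose uniformly among the cells (which gives the largest Shannon information for a given number of cells), and the number of cells of angular size delta is estimated crudely as the area of the unit sphere divided by delta squared. The function names are hypothetical.

```python
import math


def cells_for_resolution(delta):
    """Rough number of equal-area cells of angular size delta (radians)
    needed to cover the unit sphere, whose total area is 4*pi."""
    return math.ceil(4 * math.pi / delta ** 2)


def specification_bits(n_cells):
    """Shannon information, in bits, of a source that selects uniformly
    among n_cells alternatives."""
    return math.log2(n_cells)


def bits_table(resolutions):
    """Pair each angular resolution with the bits needed to specify a cell."""
    return [(d, specification_bits(cells_for_resolution(d))) for d in resolutions]
```

For instance, `bits_table([0.1, 0.01, 0.001])` shows the number of bits rising by roughly 2·log2(10) each time the angular resolution is made ten times finer, so the specification information grows without bound as the accuracy is increased.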

As will be recalled from the earlier discussion (Sections 1.3, 2.3.1), when one 
considers encoding classical information in quantum systems, it is necessary to distinguish 
between specification information and accessible information. The specification 
information refers to the information of the classical source that selects sequences of quantum 
states, the accessible information to the maximum amount of information that is 
available following measurements on the systems prepared in these states. In teleportation, 
of course, the systems are prepared near Alice before teleportation of their states to 
Bob. He may then perform various measurements to try and learn something. Call 
the information of the source selecting the states to be teleported by Alice H(A); the 
mutual information H(A : B) will determine the amount of classical information per 
system that Bob is able to extract by performing some measurement, B, following 
successful teleportation of the unknown state. The accessible information is given by the 
maximum over all decoding measurements of H(A : B). As we know, the Holevo bound 
restricts the amount of information that Bob may acquire to a maximum of one bit of 
information per qubit, that is, to a maximum of one bit of information per successful 
run of the teleportation protocol. 
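
The gap between specification information and what Bob can extract may be made concrete with a small calculation. The sketch below rests on assumptions chosen purely for illustration: the source A selects uniformly among n pure qubit states spaced evenly around a great circle of the Bloch sphere, and Bob performs one fixed measurement in the {|0⟩, |1⟩} basis (which need not be the measurement that attains the accessible information). The function names are hypothetical.

```python
import numpy as np


def shannon(p):
    """Shannon entropy in bits of a probability vector (zeros are ignored)."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 1e-15]
    return float(-(p * np.log2(p)).sum())


def analyse(n_states):
    """Return (H(A), H(A:B), Holevo quantity) for the assumed ensemble."""
    thetas = 2 * np.pi * np.arange(n_states) / n_states
    # Real amplitudes of cos(theta/2)|0> + sin(theta/2)|1>, one row per state.
    states = np.stack([np.cos(thetas / 2), np.sin(thetas / 2)], axis=1)
    prior = np.full(n_states, 1 / n_states)

    spec = shannon(prior)                    # specification information H(A)
    joint = prior[:, None] * states ** 2     # p(a, b) via the Born rule
    mutual = spec + shannon(joint.sum(axis=0)) - shannon(joint)

    # For pure encoding states the Holevo quantity is S(rho) of the average state.
    rho = (prior[:, None, None] * states[:, :, None] * states[:, None, :]).sum(axis=0)
    holevo = shannon(np.linalg.eigvalsh(rho))
    return spec, mutual, holevo
```

With `analyse(1024)` the specification information is 10 bits, while the mutual information for this measurement stays below the Holevo quantity of one bit; with `analyse(2)` the two states are orthogonal and the full single bit is recovered.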

So this gives us the sense in which the very large amount of information that may be 
associated with the unknown state being teleported to Bob is largely inaccessible to him. 
Note that the amount of information that Bob may acquire from the teleported state 
is less than the amount of classical information — two bits — that Alice had to send to 
him during the protocol. This fact is of the utmost importance, for if the Holevo bound 
did not guarantee this, and Bob were able to extract more than two bits of information 
from his system, then teleportation would give rise to paradox (when embedded in a 
relativistic theory) as superluminal signalling would be possible.^4 

^4 The argument parallels the one given by Bennett et al. (1993) to the effect that two full classical 
bits are required in teleportation. In essence, if Bob were able to gain more than two bits of information 
in the protocol, then even if he were not to wait for Alice to send him the pair of bits each time and 
simply guessed their values instead, then some information would still get across. 
So the Holevo bound ensures that teleportation is not paradoxical, but it also means 
that teleportation, when considered as a mode of ordinary classical information transfer, 
is pretty inefficient, requiring two classical bits to be sent for every bit of information 
that Bob can extract at his end. 



3.3 The puzzles of teleportation 



Let us return to the picture of teleportation that was sketched earlier. An unknown 
quantum state is teleported from Alice to Bob with nothing that bears any relation 
to the identity of the state having travelled between them. The two classical bits sent 
are quite insufficient to specify the state teleported; and in any case, their values are 
independent of the parameters describing the unknown state. The unboundedly large 
specification information characterizing the state — information that is inaccessible at the 
classical level — has somehow been disembodied, and then reincarnated at Bob's location, 
as the quantum state first 'disappears' from Alice's system and then 'reappears' with 
Bob. 

The conceptual puzzles that this process presents seem to cluster around two essential 
questions. First, how is so much information transported? And second, most pressingly, 
just how does the information get from Alice to Bob? 

Perhaps the prevailing view on how these questions are to be answered is the one 
that has been expressed by Jozsa (1998, 2003) and Penrose (1998). In their view, the 
classical bits used in the protocol evidently can't be carrying the information, for the 
reasons we have just rehearsed; therefore the entanglement shared between Alice and 
Bob must be providing the channel down which the information travels. They conclude 
that in teleportation, an indefinitely large, or even infinite amount of information travels 
backwards in time from Alice's measurement to the time at which the entangled pair was 
created, before propagating forward in time from that event to Bob's performance of his 
unitary operation and the attaining by his system of the correct state. Teleportation 
seems to reveal that entanglement has a remarkable capacity to provide a hitherto 
unsuspected type of information channel, which allows information to travel backwards 
in time; and a very great deal of it at that. Further, since it is a purely quantum link 
that is providing the channel, it must be purely quantum information that flows down 
it. It seems that we have made the discovery that quantum information is a new type 
of information with the striking, and non-classical, property that it may flow backwards 
in time. 

The position is summarized succinctly by Penrose: 

How is it that the continuous 'information' of the spin direction of the state 
that she [Alice] wishes to transmit... can be transmitted to Bob when she 
actually sends him only two bits of discrete information? The only other 
link between Alice and Bob is the quantum link that the entangled pair 
provides. In spacetime terms this link extends back into the past from Alice 
to the event at which the entangled pair was produced, and then it extends 
forward into the future to the event where Bob performs his [operation]. 

Only discrete classical information passes from Alice to Bob, so the complex 
number ratio which determines the specific state being 'teleported' must be 
transmitted by the quantum link. This link has a channel which 'proceeds 
into the past' from Alice to the source of the EPR pair, in addition to the 
remaining channel which we regard as 'proceeding into the future' in the nor- 
mal way from the EPR source to Bob. There is no other physical connection. 
(Penrose, 1998, p. 1928) 

But one might feel, with good reason, that this explanation of the nature of information 
flow in teleportation is simply too outlandish. This is the view of Deutsch and Hayden 
(2000), who conclude instead that with suitable analysis, the message sent from Alice to 
Bob can, after all, be seen to carry the information characterizing the unknown state. 
The information flows from Alice to Bob hidden away, unexpectedly, in Alice's message. 
This approach, and the question of what light it may shed on the notion of quantum 
information, is considered in detail in the next chapter. Suffice it to say at present 
that Deutsch and Hayden disagree with Jozsa and Penrose over the nature of quantum 
information and how it may flow in teleportation. 

One might adopt yet a third, and perhaps more prosaic response to the puzzles that 
teleportation poses. This is to adopt the attitude of the conservative classical quantity 
surveyor.^5 According to this view, an amount of information cannot be said to have 
been transmitted to Bob unless it is accessible to him. But of course, as we noted 

^5 A resolution along these lines, tied also to an ensemble view of the quantum state (vide infra), has 
been suggested by Barrett (2001) and Morgan (2001). 
above, the specification information associated with the state teleported to Bob is not 
accessible to him: he cannot determine the identity of the unknown state. On this view, 
then, the information associated with selecting some unknown state |χ⟩ will not have 
been transmitted to Bob until an entire ensemble of systems in the state |χ⟩ has been 
teleported to him, for it is only then that he may determine the identity of the state.^6 
To teleport a whole ensemble of systems, though, Alice will need to send Bob an infinite 
number of classical bits; and now there isn't a significant disparity between the amount 
of information that has been explicitly sent by Alice and the amount that Bob ends 
up with. One needs to send a very large number of classical bits to have transmitted 
by teleportation the very large amount of information associated with selecting the 
unknown state. 

This approach does not seem to solve all our problems, however. Someone sympa- 
thetic to the line of thought espoused by Jozsa and Penrose can point out in reply that 
there still remains a mystery about how the information characterizing the unknown 
state got from Alice to Bob: the bits sent between them, recall, have no dependence 
on the identity of the unknown state. So while the approach of the conservative classical 
quantity surveyor may mitigate our worry to some extent over the first question, it does 
not seem to help with the second. 

3.4 Resolving (dissolving) the problem 

Dwelling on the question of how the information characterizing the unknown state is 
transmitted from Alice to Bob has given rise to some conundrums. Should we side with 
Jozsa and Penrose and admit that quantum information may flow backwards in time 
down a channel constituted by shared entanglement? Or perhaps with Deutsch and 
Hayden, and agree that information should flow in a less outlandish fashion, but that 

^6 Note that we will need to adjust our scenario slightly to incorporate this view. In our initial set-up, 
the source A selected a sequence of states which were then teleported one by one to Bob. Now we 
imagine instead that following some particular output of A, an entire ensemble of systems is prepared 
in the pure state associated with that output; then this ensemble of systems — all in the same unknown 
pure state — is teleported. This adjustment is required because in our initial set-up for the teleportation 
procedure, the only way in which an ensemble of systems all in the same state could be teleported to 
Bob would be by setting the information of the source A to zero, with the tiresomely paradoxical result 
that Bob could now determine the state all right, but would gain no information by doing so. 
quantum information may be squirrelled away in seemingly classical bits? Counting 
conservatively the amounts of information available after teleportation may make us 
less anxious about the load carried in a single run of the protocol, but the question still 
remains: how did the information, in the end, get to Bob? Should we just conclude that 
it is transported nonlocally in some way? But what might that mean? 

If the question 'How does the information get from Alice to Bob?' is causing us 
these difficulties, however, perhaps it might pay to look at the question itself rather 
more closely. In particular, let's focus on the crucial phrase 'the information'. 

Our troubles arise when we take this phrase to be referring to a particular, to some 
sort of substance or entity whose behaviour in teleportation it is our task to describe. 
The assumption common to the approaches of Deutsch and Hayden on the one hand, 
and Jozsa and Penrose on the other, is that we need to provide a story about how some 
thing denoted by 'the information' travels from Alice to Bob. Moreover, it is assumed 
that this supposed thing should be shown to take a spatio-temporally continuous path. 

But recall that 'information' is an abstract noun. This means that 'the information' 
certainly does not refer to a substance or to an entity. The shared assumption is thus 
a mistaken one, and is based on the error of hypostatizing an abstract noun. (We shall 
return to this issue in the context of the Deutsch-Hayden approach once again in the 
following chapter). If 'the information' doesn't introduce a particular, then the question 
'How does the information get from Alice to Bob?' cannot be a request for a description 
of how some thing travels. It follows that the locus of our confusion is dissolved. 

But if it is a mistake to take 'How does the information get from Alice to Bob?' as 
a question about how some thing is transmitted, then what is its legitimate meaning, if 
any? It seems that the only legitimate use that can remain for this question is as a flow- 
ery way of asking: what are the physical processes involved in the transmission? Now 
this question is a perfectly straightforward one, even if, as we shall see (Section 3.5), 
the answer one actually gives will depend on the interpretation of quantum mechanics 
one adopts. But there is no longer a conceptual puzzle over teleportation. Once it is 
recognised that 'information' is an abstract noun, then it is clear that there is no further 
question to be answered regarding how information is transmitted in teleportation that 
goes beyond providing a description of the physical processes involved in achieving the 
aim of the protocol. That is all that 'How is the information transmitted?' can intelli- 
gibly mean; for there is not a question of information being a substance or entity that 
is transported, nor of 'the information' being a referring term. Thus, one does not face 
a double task consisting of a) describing the physical processes by which information is 
transmitted, followed by b) tracing the path of a ghostly particular, information. There 
is only task (a). 

The point should not be misunderstood: I am not claiming that there is no such 
thing as the transmission of information, but simply that one should not understand 
the transmission of information on the model of transporting potatoes, or butter, say, 
or piping water.^7 

3.4.1 The simulation fallacy 

Whilst paying due attention to the status of 'information' as an abstract noun provides 
the primary resolution of the problems that teleportation can sometimes seem to present 
us with, there is a secondary possible source of confusion that should be noted. This is 
what may be termed the simulation fallacy. 

Imagine that there is some physical process V (for example, some quantum-mechanical 
process) that would require a certain amount of communication or computational re- 
sources to be simulated classically. Call the classical simulation using these resources 
S. The simulation fallacy is to assume that because it requires these classical resources 
to simulate V using S, there are processes going on when V occurs that are physically 
equivalent to (are instantiations of) the processes that are involved in the simulation 
S itself (although these processes may be being instantiated using different properties 
in V). In particular, when V is going on, the thought is that there must be, at some 
level, physical processes involved in V which correspond concretely to the evolution of 

^7 Note that we do sometimes talk of a flow of information; and we do say of many physical quantities 
that are not entities or substances — for example, energy, heat — that they flow. But there is no 
analogy between the two cases, for what this latter description means is that the quantities in question 
obey a local conservation equation. It is not clear that it is at all intelligible to suggest that information 
should obey a local conservation equation. Certainly, the concept of quantity of information that is 
provided by the Shannon theory does not give us a concept of a quantity it makes sense to suggest 
might obey such an equation. (On this, see Section l3 . Kl below. 1 
the classical resources in the simulation S. The fallacy is to read off features of the 
simulation as real features of the thing simulated.^8 

A familiar example of the simulation fallacy is provided by Deutsch's argument that 
Shor's factoring algorithm supports an Everettian view of quantum mechanics (Deutsch, 
1997, p. 217). The argument is that if factoring very large numbers would require greater 
computational resources than are contained in the visible universe, then how could 
such a process be possible unless one admits the existence of a very large number of 
(superposed) computations in Everettian parallel universes? A computation that would 
require a very large amount of resources if it were to be performed classically is explained 
as a process that consists of a very large number of classical computations. But of 
course, considered as an argument, this is fallacious. The fact that a very large amount 
of classical computation might be required to produce the same result as a quantum 
computation does not entail that the quantum computation consists of a large number 
of parallel classical computations.^9 

The simulation fallacy is also evident in the common claim that Bell's theorem shows 
us that quantum mechanics is nonlocal, or the claim that the experimental violation of 
Bell inequalities means that the world must be nonlocal. Of course, what is in fact shown 
by these well-known results is that no local hidden variable model can simulate the 
predictions of quantum mechanics, nor provide a model for the experimentally observed 
correlations. But these facts about simulation don't lead directly to facts about the 
simulated: the fact that any adequate hidden variable model must be nonlocal does not 
show that quantum mechanics is nonlocal (this, of course, is an interpretation dependent 
property), nor show the world to be nonlocal. 

While the question of what classical resources would be required to simulate a given 
quantum process is an indispensable guide in the search for interesting quantum informa- 
tion protocols and is vitally important for that reason, the simulation fallacy indicates 
that it is by no means a sure guide to ontology. 

^8 Note that it will not always be fallacious to take features of a simulation to correspond to features of 
the simulated — if the features in question are explicitly analogues of features of the system or process 
being simulated. One should thus distinguish between i) simulations that involve analogues and ii) 
functional 'black-box', or input-output simulations. 

^9 For further discussion of Deutsch's conception of quantum 'parallel processing', see Steane (2003) 
and Hewitt-Horsman (2002). 
With regard to teleportation, it is important to recognise the simulation fallacy 
in order to assuage any worries that might remain over the question 'How does so 
much information get from Alice to Bob?', and to undermine further the thought that 
teleportation must be understood as a flow of information. 

For the fact that it would take a very large number of classical bits to transmit the 
identity of an unknown state from Alice to Bob does not entail that in teleportation 
there is a real corresponding transmission of information, some physical process going 
on that instantiates, albeit in a different medium, the transport of this large amount 
of information.^10 (Note that the flow of the hypostatized 'quantum information' of 
Jozsa and Penrose plays precisely this role: the analogue, in a different medium, of the 
transport of the large amount of classical information.) Equivalence from the point of 
view of information processing does not imply physical equivalence. 

Awareness of the simulation fallacy is particularly relevant when we consider the 
approach of the conservative classical quantity surveyor. Recall that the point of this 
approach is to deny that a large amount of information can be said to have been trans- 
ported to Bob in teleportation until that information is actually available to him. How- 
ever, it might be objected to this that after a single run of the teleportation protocol, 
the information characterizing the state is certainly present at Bob's location, even if 
inaccessible to him, as a system in the unknown state is present^^. 

This contention would seem to rest on an argument of the following form: The only 
way the unknown state can appear at Bob's location is if the information characterizing 
the state has actually been transported to Bob, hence on appearance of the state, the 
specification information associated with the state has indeed been transported to Bob's 
location. (Crudely, if a system in the given state is present, then the information is 
present, as it takes this information to specify the state.) But such an argument needs 
to be treated with care, for the main premise appears to rest on the simulation fallacy. 

^^Nor, for example, does the fact that there are protocols in which the state of a qubit can be
substituted for an arbitrarily large amount of classical information (Galvão and Hardy, 2003) imply
that this large amount of information is really there in the qubit.

^^It is for this reason that it is natural to marry conservative classical quantity surveying with an
ensemble view of the quantum state (see footnote 0), for then this objection would not go through:
when the two positions are conjoined, not only is the information characterizing the state not available
until the whole ensemble is teleported, but neither has the state been teleported until the whole ensemble
has been teleported to Bob.
Just because it would take a large amount of information to specify a state doesn't mean
that we should conclude that this amount of information has been physically transported 
in teleportation when Bob's system acquires the state. 

In any event, the simplest way to remain clear on whether or not, or in what way, 
information can be said to be present at Bob's location following a single run of the 
teleportation protocol is to respect the distinction between the specification information 
associated with a system and the amount of information that may be said to be encoded 
or contained in the system. Once Bob's system has acquired the state $|\chi\rangle$ teleported
by Alice, then his system has associated with it the same specification information,
$H(A)$: if one were now asked to specify the state of Bob's system, then this number
of bits would be required, on average. This quantity of information is not encoded or
contained in the system however. The mutual information $H(A:B)$ and the accessible
information provide the relevant measures of how much information Bob's system can 
be said to contain, for they govern the amount that may be decoded. But of course, as 
'information' is an abstract noun, containing information is not containing some thing, 
however aethereal. 
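
The distinction can be illustrated numerically. The following is a minimal sketch, not taken from the text: it assumes a hypothetical source A that emits the four qubit states |0>, |1>, |+>, |-> with equal probability, and it uses the Holevo quantity, a standard upper bound on the accessible information, as a stand-in for the amount that may be decoded. All function names are illustrative.

```python
import numpy as np

def shannon_entropy(probs):
    """Shannon entropy in bits of a probability vector (zero entries ignored)."""
    p = np.asarray(probs, dtype=float)
    p = p[p > 1e-12]
    return float(-np.sum(p * np.log2(p)))

def von_neumann_entropy(rho):
    """Von Neumann entropy in bits of a density matrix."""
    return shannon_entropy(np.linalg.eigvalsh(rho))

def holevo_quantity(probs, states):
    """Holevo quantity for an ensemble of PURE states: S(rho), since each S(rho_i) = 0."""
    rho = sum(p * np.outer(s, s.conj()) for p, s in zip(probs, states))
    return von_neumann_entropy(rho)

# Assumed example ensemble (hypothetical, chosen for illustration only)
s = 1 / np.sqrt(2)
states = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex),
          np.array([s, s], dtype=complex), np.array([s, -s], dtype=complex)]
probs = [0.25, 0.25, 0.25, 0.25]

specification_bits = shannon_entropy(probs)        # H(A): cost of specifying the state
decodable_bound = holevo_quantity(probs, states)   # upper bound on accessible information
print(specification_bits, decodable_bound)
```

For this assumed ensemble, specifying which state was prepared costs 2 bits on average, whereas the bound on what can be decoded from a single system is 1 bit, so the two quantities come apart in just the way described.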

3.5 The teleportation process under different interpretations

By reflecting on the logico-grammatical status of the term 'information' we have been 
able to replace the (needlessly) conceptually puzzling question of how the information 
gets from Alice to Bob in teleportation, with the simple, genuine, question of what 
the physical processes involved in teleportation are. While this may not, perhaps, be 
quite enough to still all the controversy that trying to understand teleportation has 
evoked, the controversy is now of a very familiar kind: it concerns what interpretation 
of quantum mechanics one adopts. For the detailed story one tells about the physical 
processes involved in teleportation will of course depend upon one's interpretive stance.
Two questions in particular will find different answers under different interpretations: 
first, is nonlocality involved in teleportation? and second, has anything interesting 
happened before Alice's classical bits are sent to Bob and he performs the correct unitary
operation? 

We will now see how some of these differences play out in the following familiar 
interpretations (the list of approaches considered is by no means exhaustive). 



3.5.1 Collapse interpretations: Dirac/von Neumann, GRW 



The natural place to begin is with the orthodox approach of Dirac (1947) and von Neumann
(1955) in which there is a genuine process of collapse on measurement^^. (The vagueness
over where, when, why and how this collapse takes place might be alleviated along lines
suggested by Ghirardi et al. (1986), perhaps.) If one has a genuine process of collapse
then, as noted long ago by Einstein et al. (1935)^^, one has action-at-a-distance. In the
presence of entanglement, a measurement on one system can result in a real change
to the possessed properties of another system, even when the two systems are widely
separated. (Although, as is well known, these changes do not allow one to send signals
superluminally; this is known as the no-signalling theorem^^.)

In teleportation, then, under a collapse interpretation, the effect of Alice's Bell-basis
measurement will be to prepare Bob's system, at a distance, in one of four pure states
which depend on the unknown state $|\chi\rangle$, by using the nonlocal effect of collapse. It then
only remains for Alice to send her two bits to Bob to tell him which (type of) state he
now has in his possession. Under this interpretation, teleportation explicitly involves
nonlocality, or action-at-a-distance; and it is precisely because of the nonlocal effect of
collapse, preparing Bob's system in a state that differs in one of only four ways from
$|\chi\rangle$, that a mere two classical bits need be sent by Alice in order for Bob's system to
acquire a state parameterised by two continuous values.
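
The collapse account just given can be checked with a small state-vector calculation. This is a sketch under stated assumptions rather than the author's own presentation: it takes the shared entangled pair to be the Bell state (|00> + |11>)/sqrt(2) (the thesis elsewhere considers a singlet), labels the four Bell outcomes by two bits (a, b), and models collapse as projection onto the corresponding Bell state of systems 1 and 2. The names `pauli` and `collapse_outcomes` are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
PHI_PLUS = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)  # assumed shared pair

def pauli(a, b):
    """The operator X^b Z^a relating Bob's collapsed state to the unknown state."""
    return np.linalg.matrix_power(X, b) @ np.linalg.matrix_power(Z, a)

def collapse_outcomes(chi):
    """For each Bell outcome (a, b): (probability, Bob's normalised collapsed state)."""
    psi = np.kron(chi, PHI_PLUS)  # ordering: system 1 (unknown), 2 (Alice), 3 (Bob)
    outcomes = {}
    for a in (0, 1):
        for b in (0, 1):
            # Bell state of systems 1 and 2 labelled by (a, b)
            bell = np.kron(pauli(a, b).conj().T, I2) @ PHI_PLUS
            # <bell| on systems 1 and 2, identity on system 3
            bra = np.kron(bell.conj().reshape(1, 4), I2)
            unnormalised = bra @ psi
            prob = float(np.vdot(unnormalised, unnormalised).real)
            outcomes[(a, b)] = (prob, unnormalised / np.sqrt(prob))
    return outcomes

chi = np.array([0.6, 0.8j], dtype=complex)  # alpha = 0.6, beta = 0.8i
results = collapse_outcomes(chi)
for (a, b), (prob, bob) in results.items():
    recovered = pauli(a, b).conj().T @ bob  # Bob's conditional unitary, chosen by two bits
    print((a, b), round(prob, 3), np.allclose(recovered, chi))
```

Each outcome leaves Bob's system in a state that differs from the unknown state by one of only four fixed operations, which is why the two-bit label (a, b) is all that Bob needs in order to undo it.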

It is enlightening to compare the effect of collapse in this scenario to that of a rigid 
rod held by two parties. Imagine that Alice wanted to let Bob know the value of a 
parameter that could take on values in the interval [0,1]. If they were each holding 

^^One of the defining features of what I here term 'orthodoxy' is the adoption of the standard
eigenstate-eigenvalue link for the ascription of definite values to quantum systems. See e.g. Bub (1997).

^^See Timpson and Brown (2002) for a recent discussion.

^^An early version of the no-signalling theorem, specialised to the case of spin 1/2 EPR-type
experiments, appears in Bohm (1951). Later, more general versions are given by Tausk (1967); Eberhard
(1978); Ghirardi et al. (1980). See also Shimony (1984); Redhead (1987, Chpt. 4.6).









one end of a long rigid rod, then Alice could let Bob know the value she has in mind 
simply by moving her end of the rod along in Bob's direction by a suitable distance. 
Bob, seeing how far his end of the rod moves, may infer the value Alice is thinking 
of^^. There is no mystery here about how the value of the continuous parameter is 
transmitted from Alice to Bob. Alice, by moving her end of the rod, moves Bob's by a 
corresponding amount. In teleportation, the effect of collapse is somewhat analogous: 
Bob's system is prepared, by the nonlocal effect of collapse, in a state that depends on 
the two continuous parameters characterizing $|\chi\rangle$. As we have said, collapse allows a
real change in the physical properties that a distant system possesses, if there was prior 
entanglement. Compare: pushing one end of a rigid rod axially leads to a change in the 
position of the far end. The nonlocal effect of collapse, which is here understood as a 
real physical process, is providing the main physical mechanism behind teleportation; 
and recall that once the physical mechanisms have been described (I have argued) there 
is no further question to be asked about how information is transmitted in the protocol. 

In a collapse interpretation, teleportation thus involves nonlocality, in the sense of 
action-at-a-distance, crucially. Also, something interesting certainly has happened once 
Alice performs her measurement and before she sends the two classical bits to Bob. 
There has been a real change in the physical properties of Bob's system, as it acquires 
one of four pure states. (Although note that at this stage the probability distributions
for measurements on Bob's system will nonetheless not display any dependence on the
parameters characterizing $|\chi\rangle$, in virtue of the no-signalling theorem. It is only once
the bits from Alice have arrived and Bob has performed the correct operation that
measurements on his system will display a dependence on the parameters $\alpha$ and $\beta$.)
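
The parenthetical claim can be made explicit with a short calculation. It is offered as a sketch under two assumptions that are standard for the protocol but not spelled out above: that the four possible post-measurement states of Bob's system are $\sigma_j|\chi\rangle$ (with $\sigma_0 = \mathbb{1}$ and $\sigma_1, \sigma_2, \sigma_3$ the Pauli operators, overall phases being irrelevant), and that each outcome occurs with probability 1/4. Writing $|\chi\rangle\langle\chi| = \tfrac{1}{2}(\mathbb{1} + \mathbf{n}\cdot\boldsymbol{\sigma})$, where the unit vector $\mathbf{n}$ is fixed by $\alpha$ and $\beta$, the state that governs the statistics of Bob's measurements before the bits arrive is

$$\rho_B = \frac{1}{4}\sum_{j=0}^{3} \sigma_j\,|\chi\rangle\langle\chi|\,\sigma_j = \frac{1}{2}\mathbb{1} + \frac{1}{8}\sum_{k=1}^{3} n_k \sum_{j=0}^{3} \sigma_j \sigma_k \sigma_j = \frac{1}{2}\mathbb{1},$$

since $\sigma_j\sigma_k\sigma_j$ equals $+\sigma_k$ for $j = 0$ and $j = k$, and $-\sigma_k$ for the two remaining values of $j$. The vector $\mathbf{n}$, and with it all dependence on $\alpha$ and $\beta$, drops out of the local statistics.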

3.5.2 No collapse and no extra values: Everett 

It is possible to retain the idea that the wavefunction provides a complete description
of reality while rejecting the notion of collapse; this way lies the Everett interpretation
(Everett, 1957). The characteristic feature of the Everett interpretation is that the



^^Of course, in a relativistic setting, rigid bodies would not be permissible, although they are in 
non-relativistic quantum mechanics. This does not in any case affect the point of the analogy. 

^^It should be noted that there have been a number of different attempts to develop Everett's original 
ideas into a full-blown interpretation of quantum theory. The most satisfactory of these would appear 
dynamics is always unitary; and no extra values are added to the description provided
by the wavefunction in order to account for definite measurement outcomes. Instead,
measurements are simply unitary interactions which have been chosen so as to correlate
states of the system being measured to states of a measuring apparatus. Obtaining a
definite value on measurement is then understood as the measured system coming to have
a definite state (eigenstate of the measured observable) relative to the indicator states
of the measuring apparatus and ultimately, relative to an observer^^. A treatment of
teleportation in the Everettian context was given by Vaidman (1994). Braunstein (1996)
provides a detailed discussion of the teleportation protocol within unitary quantum
mechanics without collapse.

With teleportation in an Everettian setting, and unlike teleportation under the orthodox
account, it is clear that there will be no action-at-a-distance in virtue of collapse
when Alice performs her measurement, for the simple reason that there is no process of
collapse. Instead, the result of Alice's measurement will be that Bob's system comes to
have definite relative states related to the unknown state with respect to the indicator
states of the systems recording the outcome of Alice's measurement. (It will be
argued in the next chapter that this does not amount to a new form of nonlocality.) Note,
though, that at this stage of the protocol, the reduced state of every system involved will
now be maximally mixed^^. As Braunstein (1996) notes, this feature corresponds to the
'disembodiment' of the information characterizing the unknown state in the orthodox
account of teleportation: following Alice's measurement, all the systems involved in the

to be an approach on the lines of Saunders and Wallace (Saunders, 1995, 1996b, 1998, 1996a; Wallace,
2002, 2003), which resolves the preferred basis problem and has made considerable progress on the
question of the meaning of probability in Everett (on this, see in particular Deutsch (1999); Wallace
(2003)).

^'^This is the case for ideal first-kind (non-disturbing) measurements. The situation becomes more 
complicated when we consider the more physically realistic case of measurements which are not of 
the first kind; in some cases, for example, the object system may even be destroyed in the process of 
measurement. What is important for a measurement to have taken place is that measuring apparatus 
and object system were coupled together in such a way that if the object system had been in an 
eigenstate of the observable being measured prior to measurement, then the subsequent state of the 
measuring apparatus would allow us to infer what that eigenstate was. In this more general framework 
the importance is not so much that the object system is left in an eigenstate of the observable relative
to the indicator state of the measuring apparatus, but that we have definite indicator states relative to 
macroscopic observables. 

^^This would not in general be the case if the initial entangled state were not maximally entangled, 
or if Alice's measurement were not an ideal measurement; with these eventualities, the teleportation 
would be imperfect (fidelity less than 1). 
protocol will have become fully entangled. Dependence on the parameters characterizing
the unknown state will only be observable with a suitable global measurement, not for 
any local measurements. In particular, one can consider the correlations that now exist 
between the systems recording the outcome of Alice's measurement and Bob's system. 
Certain of the joint (and irreducible) properties of these spatially separated systems will 
depend on the identity of the unknown state. In this sense, the information characterizing
$|\chi\rangle$ might now be said to be 'in the correlations' between these systems. (This is
the terminology Braunstein adopts.) 

Once Bob has been sent the systems recording the outcome of Alice's measurement, 
however, he is able to disentangle his system from the other systems involved in the 
protocol. Its state will now factorise from the joint state of the other systems; and will 
in fact be the pure state $|\chi\rangle$. Dependence on the parameters $\alpha$ and $\beta$ will finally be
observable for local measurements once more, but this time, only at Bob's location. 
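
The two stages just described can be reproduced in a purely unitary toy model. The sketch below is illustrative only and rests on assumptions not made in the text: the shared pair is taken to be (|00> + |11>)/sqrt(2), Alice's Bell-basis measurement is modelled as a basis rotation followed by controlled-NOT gates that copy the outcome into two record qubits, and Bob's conditional operation is modelled as gates controlled by those records. Function names and the qubit ordering are hypothetical.

```python
import numpy as np

N = 5  # qubit 0: unknown state; 1: Alice's half; 2: Bob's half; 3, 4: record systems
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
P0 = np.diag([1, 0]).astype(complex)
P1 = np.diag([0, 1]).astype(complex)

def on(gate, target):
    """Embed a single-qubit gate on `target` into the N-qubit space."""
    out = np.array([[1]], dtype=complex)
    for q in range(N):
        out = np.kron(out, gate if q == target else I2)
    return out

def controlled(gate, control, target):
    """Apply `gate` to `target` when `control` is |1>, otherwise do nothing."""
    return on(P0, control) + on(P1, control) @ on(gate, target)

def reduced_state(psi, keep):
    """Reduced density matrix of qubit `keep` for the pure state `psi`."""
    t = np.moveaxis(psi.reshape([2] * N), keep, 0).reshape(2, -1)
    return t @ t.conj().T

def after_alice(chi):
    """State of all five systems after Alice's measurement, modelled unitarily."""
    bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    zero = np.array([1, 0], dtype=complex)
    psi = np.kron(np.kron(np.kron(chi, bell), zero), zero)
    rotate = on(H, 0) @ controlled(X, 0, 1)             # Bell basis -> computational basis
    record = controlled(X, 1, 4) @ controlled(X, 0, 3)  # copy outcomes into the records
    return record @ rotate @ psi

def after_bob(psi):
    """Bob's conditional unitary, controlled by the record systems sent to him."""
    return controlled(Z, 3, 2) @ controlled(X, 4, 2) @ psi

chi = np.array([0.6, 0.8j], dtype=complex)
psi_mid = after_alice(chi)
psi_end = after_bob(psi_mid)
print([np.allclose(reduced_state(psi_mid, q), I2 / 2) for q in range(N)])
print(np.allclose(reduced_state(psi_end, 2), np.outer(chi, chi.conj())))
```

In this model every single-system reduced state is maximally mixed after Alice's measurement, so no local measurement reveals the parameters; after Bob's conditional operation his system alone factorises into the pure state.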

In collapse versions of quantum mechanics, the nonlocal effect of collapse was the 
main physical mechanism underlying teleportation. In the no-collapse Everettian set- 
ting, the fundamental mechanism is provided by the fact that in the presence of en- 
tanglement, local unitary operations — in this case, Alice's measurement — can have a 
non-trivial effect on the global state of the joint system. 

So, has anything significant happened at Bob's location before Alice sends him the 
result of her measurement and he performs his conditional unitary operation? Well, 
arguably not: nothing has happened other than all of the systems involved in the protocol 
having become entangled, as a result of the various local unitary operations. 



3.5.3 No collapse, but extra values: Bohm 



The Bohm theory account provides us with an interesting intermediary view of teleportation,
in which there is no collapse of the wavefunction, but nonlocality plays an
interesting role. We shall follow the analysis of Maroney and Hiley (1999).

The Bohm theory (Bohm, 1952) is a nonlocal, contextual, deterministic hidden variable
theory, in which the wavefunction $\Psi(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n, t)$ of an n-body system evolves
unitarily according to the Schrödinger dynamics, but is supplemented with definite values
for the positions $\mathbf{x}_1(t), \mathbf{x}_2(t), \ldots, \mathbf{x}_n(t)$ of the particles. Momenta are also defined
according to $\mathbf{p}_i = \nabla_i S$, where $S$ is the phase of $\Psi$, hence a definite trajectory may be
associated with a system, where this trajectory will depend on the many-body wavefunction
(and thus, in general, on the positions and behaviour of all the other systems,
however far away). If the initial probability distribution for particle positions is assumed
to be given by $|\Psi|^2$, then the same predictions for measurement outcomes will be made
as in ordinary quantum mechanics. For detailed presentations of the Bohm theory, see
Bohm and Hiley (1993) and Holland (1995).



The guiding effect of the wavefunction on the particle positions may also be understood
in terms of a new quantum potential that acts on particles in addition to the
familiar classical potentials. The quantum potential is given by

$$Q(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n) = -\hbar^2 \sum_{i=1}^{n} \frac{\nabla_i^2 R}{2 m_i R},$$

where $R$ is the amplitude of $\Psi$ and $m_i$ is the mass of the i-th particle. Among the ways
in which this quantity differs from a classical potential is that it will in general give rise
to a nonlocal dynamics (that is, in the presence of entanglement, the force on a given
system will depend on the instantaneous positions of the other particles, no matter how
far away); and it may be large even when the amplitude from which it is derived is small.
Bohm and Hiley (1993, §3.2) suggest that the quantum potential should be understood
as an 'information potential' rather than a mechanical potential, as a way of accounting
for its peculiar properties.
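
To get a feel for the remark that the quantum potential may be large where the amplitude is small, the formula can be evaluated in the simplest case. The following is an illustrative sketch only: it assumes a single particle in one dimension whose amplitude R is an unnormalised Gaussian of width sigma, with units chosen so that hbar = m = 1. The function name and the grid are hypothetical choices. The finite-difference estimate is compared with the closed form Q = -(1/2)(x^2 / (4 sigma^4) - 1 / (2 sigma^2)) obtained by differentiating the Gaussian twice.

```python
import numpy as np

def quantum_potential_1d(R, x, hbar=1.0, m=1.0):
    """Finite-difference estimate of Q = -(hbar^2 / 2m) R''/R on the interior of a uniform grid."""
    dx = x[1] - x[0]
    second_derivative = (R[2:] - 2.0 * R[1:-1] + R[:-2]) / dx**2
    return -(hbar**2 / (2.0 * m)) * second_derivative / R[1:-1]

sigma = 1.0
x = np.linspace(-6.0, 6.0, 2401)
R = np.exp(-x**2 / (4.0 * sigma**2))  # assumed Gaussian amplitude

Q_numeric = quantum_potential_1d(R, x)
xi = x[1:-1]  # interior grid points, where the central difference is defined
Q_exact = -0.5 * (xi**2 / (4.0 * sigma**4) - 1.0 / (2.0 * sigma**2))

# Q depends only on the ratio R''/R, so rescaling the amplitude leaves it unchanged
Q_rescaled = quantum_potential_1d(1e-6 * R, x)

print(np.allclose(Q_numeric, Q_exact, atol=1e-3))
print(np.allclose(Q_rescaled, Q_numeric))
```

In this example the magnitude of Q grows quadratically in the tails of the wavepacket, precisely where R is smallest, and it is unaffected by an overall rescaling of R.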

The determinate values for position in the Bohm theory are usually understood as 
providing the definite outcomes of measurement^^ that would appear to be lacking in a 
no-collapse version of quantum mechanics, in the absence of an Everett-style relativiza- 
tion. Following a measurement interaction, the wavefunction of the joint object-system 
and apparatus will have separated out (in the ideal case) into a superposition of non- 
overlapping wavepackets (on configuration space) corresponding to the different possible 
outcomes of measurement. The determinate values for the positions of the object-system 

^^Note, though, that measurement may not usually be understood as revealing pre-existing values in 
the Bohm theory. 
and apparatus pointer variable will pick out a point in configuration space; and the outcome
that is observed, or is made definite, is the one corresponding to the wavepacket
whose support contains this point. The wavefunction for the total system remains as
a superposition of all of the non-overlapping wavepackets, however. Bohm and Hiley
(1993) introduce the notions of active, passive and inactive information to describe
this feature of the theory. If $\Psi$ may be written as a superposition of non-overlapping
wavepackets, then they suggest that the definite configuration point of the total system 
picks out one of these wavepackets (the one whose support contains the point) as ac- 
tive. The evolution of the point is determined solely by the wavepacket containing it; 
and in keeping with their conception of Q as an information potential, the information 
associated with this wavepacket is said to be active. The information associated with 
the other wavepackets is termed either 'passive', or 'inactive'. 'Passive', if the different 
wavepackets may in the future be made to overlap and interfere, 'inactive' if such inter- 
ference would be a practical impossibility (as for example, if environmental decoherence 
has occurred in a measurement-type situation — this corresponds to the case of 'effective 
collapse' of the wavefunction). 

In their discussion of the teleportation protocol, Maroney and Hiley adopt the ap- 
proach in which a definite spin vector is also associated with each spin 1/2 particle, 
in addition to its definite position. The idea is that with each system is associated an 
orthogonal set of axes (body axes) whose orientation is specified by a real three dimen- 
sional spin vector, s, along with an angle of rotation about this vector; where these 
quantities are determined by the wavefunction^*^. 

The analysis of teleportation then proceeds much as in the Everett interpretation, 
save that we may also consider the evolution of the determinate spin vectors associated 
with the various systems. Initially, the system in the unknown state $|\chi\rangle$ will have
some definite spin vector that depends on $\alpha$ and $\beta$, $\mathbf{s}(\alpha,\beta)$, while it turns out that if
Alice and Bob share a singlet state, the spin vectors for their two systems will be zero
(Bohm and Hiley, 1993, §10.6). Now Alice performs her Bell-basis measurement. As



^20 This is the approach to spin of Bohm et al. For a systematic presentation see
Bohm and Hiley (1993, §10.2-10.3) or Holland (1995, Chpt. 9). Other approaches to spin are possible,
e.g., Bohm and Hiley (1993, §10.4-10.5), Holland (1995, Chpt. 10), or the 'minimalism' of Bell
(1966, 1981) in which no spin values are added.
in the Everettian picture, the effect of measurement is to entangle the systems being
measured with systems recording the outcome of the measurement. But this is not
the only effect, in the Bohm theory. The total wavefunction is now a superposition of
four terms corresponding to the four possible outcomes of Alice's measurement; and one
of these four terms will be picked out by the definite position value of the measuring
apparatus pointer variable. For each of these four terms taken individually, Bob's system
will be in a definite state related to the state $|\chi\rangle$, thus with each will be associated a
definite spin vector $\mathbf{s}_j(\alpha,\beta)$, $j = 1, \ldots, 4$, pointing in some direction. When one of the
four terms is picked out as active, and the others rendered passive (or inactive), following
Alice's measurement, the spin vector for Bob's system will change instantaneously from
zero to one of the four $\mathbf{s}_j(\alpha,\beta)$ (Maroney and Hiley, 1999).



Thus in the Bohm theory, teleportation certainly involves nonlocality; and moreover
something very interesting does happen as soon as Alice has made her measurement:
Bob's system acquires a definite spin vector that depends on the parameters characterizing
the unknown state, as a result of a nonlocal quantum torque (Maroney and Hiley,
1999). Furthermore, there is a one in four chance that this spin vector will be the same as
the original $\mathbf{s}(\alpha,\beta)$; and all this while the total state of the system remains uncollapsed,
with all the particles entangled.

Finally, as we have seen before, once Alice sends Bob systems recording the outcome 
of her measurement, he may perform the conditional unitary operation necessary to 
disentangle his system from the others, and leave his system in the state $|\chi\rangle$. The spin
vector of his system will now be $\mathbf{s}(\alpha,\beta)$ with certainty.

A note on active information 



The conclusion of Maroney and Hiley (1999) and Hiley (1999) is that according to the
Bohm theory, what is transferred from Alice's region to Bob's region in the teleportation
protocol is the active information that is contained in the quantum state of the initial
system. However questions may be raised about how apposite this description is.

For ease of reference, let us introduce labels for some of the systems involved in 
teleportation. Call the system whose unknown state is to be teleported, system 1; 
Alice's half of the entangled pair, system 2; and Bob's half, system 3. Also let us label
the pointer degree of freedom of the measuring apparatus by $x_0$. At the beginning of
the teleportation protocol, the state of system 1 factorises from the entangled joint state
of 2 and 3; and the state of the measuring apparatus will also factorise. Accordingly,
the quantum potential will be given by a sum of separate terms:

$$Q(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3,x_0) = Q(\mathbf{x}_1,\alpha,\beta) + Q(\mathbf{x}_2,\mathbf{x}_3) + Q(x_0), \tag{3.1}$$

where it has been noted that the first term, the one that will determine the motion of 
system 1, depends on the parameters characterizing the unknown state^^. 
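
The additivity asserted in (3.1) follows directly from the definition of the quantum potential as a sum of terms proportional to $\nabla_i^2 R / R$. As a sketch, suppose (as stated above) that the initial wavefunction is a product, $\Psi = \psi_1(\mathbf{x}_1)\,\psi_{23}(\mathbf{x}_2,\mathbf{x}_3)\,\psi_0(x_0)$, so that its amplitude factorises as $R = R_1 R_{23} R_0$; the factor labels are introduced here purely for illustration. Each Laplacian then acts on only one factor, and the remaining factors cancel in the ratio:

$$\frac{\nabla_1^2 R}{R} = \frac{\nabla_1^2 R_1}{R_1}, \qquad \frac{\nabla_i^2 R}{R} = \frac{\nabla_i^2 R_{23}}{R_{23}} \quad (i = 2, 3), \qquad \frac{\partial_{x_0}^2 R}{R} = \frac{\partial_{x_0}^2 R_0}{R_0}.$$

The sum over systems therefore splits into three independent pieces, one for each factor, and only the first depends on $\alpha$ and $\beta$, because only $\psi_1$ does.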

Once Alice performs her Bell basis measurement, however, all the systems become 
entangled; and the potential will be of the form: 

$$Q(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3,x_0) = Q(\mathbf{x}_1,\mathbf{x}_2,x_0) + Q(\mathbf{x}_3,x_0,\alpha,\beta). \tag{3.2}$$

The part of the quantum potential that will affect system 3 now depends on $\alpha$ and $\beta$.

Finally, at the end of the protocol, systems 1, 2 and the measuring apparatus are 
left entangled; and system 3, in the pure state $|\chi\rangle_3$, factorises. The quantum potential
then takes the form:

$$Q(\mathbf{x}_1,\mathbf{x}_2,\mathbf{x}_3,x_0) = Q(\mathbf{x}_1,\mathbf{x}_2,x_0) + Q(\mathbf{x}_3,\alpha,\beta). \tag{3.3}$$

Maroney and Hiley say: 

What we see clearly emerging here is that it is active information that has
been transferred from particle 1 to particle 3 and that this transfer has
been mediated by the nonlocal quantum potential. (Maroney and Hiley, 1999,
p. 1413)

...it is the objective active information contained in the wavefunction that is
transferred from particle 1 to particle 3. (Maroney and Hiley, 1999, p. 1414)

Note that the part of the potential that is active on system 3 will already have 

^^The component of the force on the i-th system due to the quantum potential is given by
$m_i \ddot{\mathbf{x}}_i = -\nabla_i Q$ (cf. Holland, 1995, §7.1.2); therefore, only terms in the sum which depend on
$\mathbf{x}_i$ will contribute to the motion of the i-th system.
acquired a dependence on $\alpha$ and $\beta$ before the end of the protocol; that is, as soon
as Alice has performed her measurement. So if active information depending on these
parameters is transferred at all, it will have been transferred before the end of the
protocol. However it is not until Alice has sent her message to Bob and he performs his
conditional operation that the term $Q(\mathbf{x}_3,\alpha,\beta)$ in eqn. (3.3) will take the same form as
the initial $Q(\mathbf{x}_1,\alpha,\beta)$.

The difficulties for the stated conclusion arise when we consider more closely what is
meant by 'active information'. In Maroney and Hiley (1999) and Hiley (1999), the connection
is made with a different sense of the word 'information' than the ones with which
we have so far been concerned in this thesis. This is a sense that derives from the verb 
'inform' under its branch I and II senses (Oxford English Dictionary), viz. to give form 
to, or, to give formative principle to (this latter, a Scholastic Latin offshoot). 

Thus 'information' as it appears in 'active information' and company, means the 
action of giving form to^^. 'The information of x' (read: the in-formation of x) means
the action of giving form to x. 

Now, while we may understand what is meant by Q being said to be an information 
potential — it is a potential that gives form to something, presumably the possible 
trajectories associated with particles (although note that the distinction with mechanical 
potentials is now blurred, as these give form to the possible trajectories too) — and may 
understand the term 'active' as picking out the part of the quantum potential that is 
shaping the actual trajectory in configuration space of the total system, it does not make 
sense to say that active information is transferred in teleportation. Because 'information' 
here refers to a particular action — the giving of a form to something — and an action 
is not a thing that can be moved^^. The same type of action may be taking place at two 
different places, or at two different times, but an action may not be moved from A to 
B. 

Thus with 'active information' understood in the advertised way, all that can be said 
is that an action of the same type is being performed (by the quantum potential) on 

^^Cf. OED 'information', sense 7. 

^^On some accounts, an action is the bringing about of some event or state of affairs by an agent
(Alvarez and Hyman, 1998); on others, an action is an event (Davidson, 1980). On no account is an
action something which can intelligibly be said to be moved about.
system 3 at the end of the teleportation protocol as was being performed on system 1
at the beginning, not that something has been transferred between the two. We may
not, then, understand 'transfer' literally. When all is said and done, it is perhaps clearer
simply to adopt the standard description and say that the quantum state of particle 1
has been 'transferred' in teleportation; that is (as a quantum state is a mathematical
object and therefore cannot literally be moved about either), that system 3 has been
made to acquire (is left in) the unknown state $|\chi\rangle$.

To sum up: it perhaps looked as if the Bohmian notion of active information might 
provide us with a sense of what is transported in teleportation if we insist that informa- 
tion, 'the information in the wavefunction', is, in a literal sense, transported. But this 
proves not to be the case. 

3.5.4 Ensemble and statistical viewpoints 

So far, in all the interpretations we have considered, the quantum state may describe 
individual systems. Let us close this section by looking briefly at approaches in which 
the state is taken only to describe ensembles of systems. 

We may broadly distinguish two such approaches. The first I will term an ensemble 
viewpoint. In this approach, the state is taken to represent a real physical property, but 
only of an ensemble. Following a measurement, the ensemble must be left in a proper 
mixture^'*, in order for there to be definite outcomes, i.e., the ensemble is left in an 
appropriate mixture of sub-ensembles, each described by a pure state (eigenstate of the 
measured observable). Thus there will be a real process of collapse, but only at the 
level of the ensemble, not for individual systems (which are not being described by a 
quantum state, if at all). 

The second approach I call a statistical interpretation. (This is the interpretation
that would be adopted by instrumentalists, for example.) On this view, the quantum
formalism merely describes the probabilities for measurement outcomes for ensembles;
there is no description of individual systems and collapse does not correspond to any
real physical process.

^^See d'Espagnat (1976) for this terminology, also Timpson and Brown (2004).
On both these approaches, as the state is only associated with an ensemble, it is
not until an entire ensemble has been teleported to Bob (that is, Alice has run the
teleportation protocol on every member of an ensemble in the unknown state $|\chi\rangle$) that
he acquires something in the state $|\chi\rangle$. An ensemble or statistical viewpoint thus makes
a natural partner to conservative classical quantity surveying in teleportation.

Under the statistical interpretation, there is clearly no nonlocality involved in telepor- 
tation, as there is no real process of collapse; and nothing of any interest has happened 
before the required classical bits are sent to Bob. (The no-signalling theorem entails 
that Alice's measurement won't affect the probability distributions for distant measure- 
ments.) The end result of the completed teleportation process is that Bob's ensemble 
is ascribed the state $|\chi\rangle$; where this merely means that the statistics one will expect for
measurements on Bob's ensemble are now the same as those one would have expected 
for measurements on the initial ensemble presented to Alice. 

The ensemble viewpoint presents a rather different picture, as it does involve a
real process of collapse, even if only at the ensemble level. Let us suppose that Alice
has performed the Bell basis measurement on her ensembles, but has not yet sent the
ensemble of classical bits to Bob. The effect of this measurement will have been to leave
Bob's ensemble in a proper mixture composed of sub-ensembles in the four possible
states a fixed rotation away from $|\chi\rangle$. Thus there has been a nonlocal effect: that
of preparing what was an improper mixture into a particular proper mixture, whose
components depend on the parameters characterizing the unknown state. The use of
the flock of classical bits that Alice sends to Bob is to allow him to separate out the
ensemble he now has into four distinct sub-ensembles, on each of which he performs the
relevant unitary operation, ending up with all four being described by the state $|\chi\rangle$.

3.6 Concluding remarks 

The aim of this chapter has been to show how substantial conceptual difficulties can arise 
if one neglects the fact that 'information' is an abstract noun. This oversight seems to 
lie at the root of much confusion over the process of teleportation; and this gives us very
good reason to pay attention to the logical status of the term. A few closing remarks
should be made. 

Schematically, a central part of the argument has been of the following form: 

Puzzles arise when we feel the need to tell a story about how something travels from 
Alice to Bob in teleportation. In particular, it might be felt that this something needs to
travel in a spatio-temporally continuous fashion; and one might accordingly feel pushed 
towards adopting something like the Jozsa/Penrose view. 

But if 'the information' doesn't pick out a particular, then there is no thing to take 
a path, continuous or not, therefore the problem is not a genuine one, but an illusion. 

We can imagine a number of objections. A very simple one might take the following 
form: You have said that information is not a particular or thing, therefore it does not 
make sense to inquire how it flows (but only inquire about the means by which it is 
transmitted). But don't we have a theory that quantifies information (viz. communication
theory); and if we can say how much of something there is, isn't that enough to
say that we have a thing, or a quantity that can be located?

This objection is dealt with quickly. Note that this form of argument will not work 
in general — one can say how much a picture might be worth in pounds and pence, for 
example, but this is not quantifying an amount of stuff, nor describing a quantity with 
a location — and it does not work in this particular case either (cf. Section [TT^ . The 
Shannon information doesn't quantify an amount of stuff that is present in a message, 
say, nor the amount of a certain quantity that is present at some spatial location. The 
Shannon information $H(X)$ describes a specific property of a source (not a message),
namely, the amount of channel resources that would be required to transmit the mes- 
sages the source produces. This is evidently not to quantify an amount of stuff, nor to 
characterize a quantity that has a spatial location. (The source certainly has a spatial 
location, but its information does not.) Or consider the mutual information. Loosely 
speaking, this quantity tells us about the amount we may be able to infer about some 
event or state of affairs from the obtaining of another event or state of affairs. But how 
much we may infer is not a quantity it makes sense to ascribe a spatial location to. 
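A small numerical illustration may help fix the point. This is a sketch under stated assumptions: the two source distributions are invented for the example and `shannon_information` is simply a name chosen here. The quantity computed is fixed by the probability distribution of the source alone; no particular message, and no location, enters the calculation.

```python
import math

def shannon_information(probs):
    """H(X) = -sum_x p(x) log2 p(x), in bits per symbol, for a source distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Two hypothetical binary sources; each may emit the very same individual messages.
fair = shannon_information([0.5, 0.5])
biased = shannon_information([0.9, 0.1])
```

The fair source requires one bit of channel resource per symbol; the biased source requires fewer (roughly 0.47 bits per symbol), although any given string of symbols might have been emitted by either.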

Another objection might be as follows: You have suggested that it is a mistake to
hypostatize information, to talk of it as a thing that moves about. How is this to be
reconciled with some of the ways we often talk about information in physics, especially
the example in relativity, where the most natural way of stating an important constraint
is to say that relativity rules out the propagation of information faster than the speed
of light?

The response is that one can admit this mode of talking without it entailing a
hypostatized conception of information. The constraint is that superluminal signalling
is ruled out on pain of temporal loop paradoxes (e.g. Rindler, 1991, §7.ix). What this
means is that no physical process is permissible that would allow a signal to be sent
superluminally and thus allow information to be transmitted superluminally. What are
ruled out are certain types of physical processes, not, save as a metaphor, certain types
of motion of information^^.

A final objection that might be raised to support the line of thought that inclines 
one towards the Jozsa and Penrose conception of teleportation is just this: Well, don't 
we after all require that information be propagated in a spatio-temporally continuous 
way? Even if this is not to be construed as a flow of stuff, or the passage of an entity? 

The response illustrates part of the value of noting the features of the term 'infor- 
mation' that have been emphasized in this chapter. 

The genuine question we face is: what are the physical processes that may be used 
to transmit information? Not the (obscure) question 'How does information behave?'. 
Once we see what the question is clearly, then the answer, surely, is to be given by our 
best physical theory describing the protocol in question. To be sure, many of the most 
familiar classical examples we are used to use spatio-temporally continuous changes in
physical properties to transmit information (a prime example might be the use of radio 
waves), but it is up to physical theory to tell us about the nature of the processes we are 
using to transmit information in any given situation. And the examples we have found 
in entanglement assisted communication seem precisely to be examples in which global 
rather than local properties are being used to carry information; and there seems not to
be a useful sense in which information is being carried in a spatio-temporally continuous
way (although, see Chapter 4 for further discussion of Deutsch and Hayden's opposing
view).

^^The types of processes in question might not be identifiable without recourse to concepts of what
would count as successful transmission of information, but this does not mean that one has to conceive
of information as an entity or substance, just that one needs a concept of what it means to receive a
signal from which one can learn something.

It is not the nature of information that is at issue, but the nature of the physical 
objects and the physical properties we may use to transmit information. 

The value of getting clear on the real nature of the question one faces about infor- 
mation transmission in teleportation will become evident again in the following chapter. 

On a final note, the deflationary approach that has been adopted towards teleportation
in this chapter should be compared with what may be termed the 'nihilist' approach
of Duwell (2003). While I am in broad sympathy with much of what Duwell has to say,
we differ on some important points. Duwell also advocates the view that quantum in- 
formation is not a substance, but reaches from this the strong conclusion that quantum 
information does not exist. From the current point of view this conclusion is unwar- 
ranted. Certainly, quantum information is not a substance or entity, but this does not 
mean that it doesn't exist, it is just a reflection of the fact that 'information' is an 
abstract noun. 'Beauty' for example, is an abstract noun, but no one would want to 
conclude that there is no beauty in the world. Moreover, Duwell's conclusion could only 
possibly be hyperbolical, for if classical information can be said to exist, then so too 
can quantum information; and contrapositively, if quantum information does not exist, 
then no more does classical information. The concept of classical information is given by 
Shannon's noiseless coding theorem, the concept of quantum information, by the quan- 
tum noiseless coding theorem. As we are by now vividly aware, these are not concepts of 
material quantities or things. But rejecting the concept of quantum information would 
be akin to cutting off one's nose to spite one's face; and is by no means necessary in 
order to get a proper understanding of teleportation. 

Teleportation is not rendered unproblematic by trying to do without the notion of 
quantum information and facing the protocol equipped only with Shannon's concept, 
but simply by resisting the temptation to hypostatize an abstract noun; and, having 
recognised the status of 'information' as an abstract noun, by realising that the only 
genuine question one faces is the relatively straightforward one of describing the physical
processes by which information is transmitted.



Chapter 4

The Deutsch-Hayden Approach: Nonlocality and Information Flow



4.1 Introduction 



The existence of entanglement, and the associated questions concerning nonlocality, are
of perennial interest in the foundations of quantum mechanics (Einstein et al., 1935;
Schrödinger, 1935a, 1936; Bell, 1964; Redhead, 1987; Maudlin, 2002). As we have
seen, following the development of quantum information theory, entanglement assisted
communication (Bennett and Wiesner, 1992; Bennett et al., 1993) has presented a new
sphere in which puzzles may arise. In this context, an important development has been
the claim of Deutsch and Hayden (2000) to provide an especially local story about
quantum mechanics, by making use of the Heisenberg picture. They claim, moreover, finally
to have clarified the nature of information flow in entangled quantum systems, reaching
the conclusion that information is a local quantity, even in the presence of entanglement.
The approach of Deutsch and Hayden was mentioned in passing in the previous chapter.
The aim of this chapter is to assess their claims in detail.

Their discussion takes place within the context of unitary quantum mechanics with- 
out collapse, and without the addition of determinate values; and they proceed to make 
two claims to locality. First, they suggest, even in the presence of entanglement, the 
state of the global system can in fact be seen to be completely determined by the
states of the individual subsystems, when these states are properly construed (a
conclusion not available in the usual Schrödinger picture and one supposed to chime with
Einstein's well-known demand for a real state for spatially separated systems (Schilpp,
1979, pp. 77-83)). Second, the effects of local unitary operations, again, even in the
presence of entanglement, are explicitly seen to be local in their picture.

However, before the implications of their formalism may be assessed, something needs 
to be said about how it is to be interpreted. Deutsch and Hayden are not explicit on 
this point and do not offer any interpretation. This proves problematic as two different 
modes of interpretation of their formalism may be discerned — what may be called the 
conservative and the ontological interpretations — and quite different conclusions follow 
concerning the questions of locality and information flow within these interpretations. 

The conservative interpretation, perhaps the most natural way of reading the Deutsch- 
Hayden paper, takes the formalism at face value, simply as a re-writing of standard 
unitary quantum mechanics. In this case, we shall see, there are no novel gains with re- 
spect to locality and Deutsch and Hayden's claims about information flow prove at best 
misleading. Under the ontological interpretation, though, a dramatic departure from 
our usual ways of understanding quantum mechanics is made and a wholly new range 
of intrinsic properties of subsystems introduced. These would substantiate Deutsch and 
Hayden's claims, but at a certain cost of plausibility. We should note too that the onto- 
logical interpretation of the Deutsch-Hayden formalism is best seen as the postulation 
of a new type of theory, rather than being a new way of interpreting familiar quantum 
mechanics. 

The discussion will begin in Section 4.2, where the machinery of the Deutsch-Hayden
approach is outlined, in particular, the mathematics that lies behind the two claims to
locality. These claims are then assessed (Section 4.3), for the conservative and ontological
interpretations in turn.

Note that in Deutsch-Hayden we have a formalism without collapse and without 
the addition of determinate values. If we are to consider the question of the locality of 
their approach, the appropriate comparisons are therefore with other approaches that 
are consistent with this assumption. On the one hand we should compare with a realist
approach of the Everett stripe (Everett, 1957; Saunders, 1996a; Wallace, 2002), while on
the other we should compare with a form of statistical interpretation, by which, recall, I 
mean an interpretation in which quantum mechanics merely describes probabilities for 
measurement outcomes for ensembles, there is no description of individual systems and 
collapse does not correspond to any real physical process for individual systems. The 
question to be answered, then, is: do Deutsch and Hayden present us with advantages 
with respect to locality that are not also shared by these other approaches? We shall 
see that under the conservative interpretation, they do not. 

In Section 4.4 attention finally turns to the question of information flow in entangled
systems. In Section 4.4.1 the nature of the question at issue is clarified, before Deutsch
and Hayden's explanation of quantum teleportation and their introduction of the concept
of locally inaccessible information is considered (Section 4.4.2). Their claims regarding
the nature of information flow are then evaluated for the conservative and ontological
interpretations in turn (Section 4.4.3), along axes provided by three questions: i) Have
Deutsch and Hayden finally given the correct account of teleportation, as compared to
related accounts such as that of Braunstein (1996)? ii) Is the concept of locally
inaccessible information useful? iii) Have they provided us with a new concept of information,
or quantum information? We close with a brief summary.



4.2 The Deutsch-Hayden Picture 

Deutsch and Hayden consider a network of $n$ interacting qubits as their model of a
general quantum system. They take as the object describing the state of the $i$th qubit
at time $t$ a triple

\[ \mathbf{q}_i(t) = \bigl(q_{i,x}(t),\, q_{i,y}(t),\, q_{i,z}(t)\bigr) \]

of $2^n \times 2^n$ Heisenberg picture operators satisfying the familiar commutation and
anti-commutation relations of the Pauli spin operators. This object they term the 'descriptor'
of a system. To see how this representation works, let us first recall the basics of the
Heisenberg picture.

Time dependence in quantum mechanics can either be associated with the vector (ket)
representing the state, or with the operator representing the observable. In the
Schrödinger picture, the state ket undergoes unitary evolution ($|\psi\rangle \mapsto U|\psi\rangle$); in the
Heisenberg picture, the state ket remains unchanged and the basis kets of the
Hilbert space are evolved ($|a_i\rangle \mapsto U^\dagger|a_i\rangle$). Another useful way of representing these
facts is given by the Hilbert-Schmidt representation.

As we have seen (Section 2.3.2), the set of $N \times N$ complex Hermitian matrices forms
an $N^2$ dimensional real vector space, $V_h(\mathbb{C}^N)$, on which we may define an inner product
$(A,B) = \mathrm{Tr}(AB)$, $A, B \in V_h(\mathbb{C}^N)$, and norm $\|A\| = \sqrt{\mathrm{Tr}A^2}$; and just as in our familiar
examples of vector spaces, e.g. Euclidean $\mathbb{R}^n$, it is useful to define a set of basis vectors
for the space. We require $N^2$ linearly independent operators $\Gamma_j \in V_h(\mathbb{C}^N)$, and we may
require orthogonality and a fixed normalisation: $\mathrm{Tr}(\Gamma_j\Gamma_{j'}) = \mathrm{const.}\,\delta_{jj'}$.

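Recall that any observable can then be represented in this space in the form:

\[ A = \sum_j \mathrm{Tr}(A\Gamma_j)\,\Gamma_j = \sum_j a_j \Gamma_j, \tag{4.2} \]

where the $\mathrm{Tr}(A\Gamma_j) = a_j$ are the components of the vector $\mathbf{A}$ representing the observable
$A$. In particular the density matrix $\rho$ can also be written as a vector:

\[ \boldsymbol{\rho} = \frac{\mathbf{1}}{N} + \sum_{j=1}^{N^2-1} \mathrm{Tr}(\rho\Gamma_j)\,\Gamma_j = \sum_{j=0}^{N^2-1} \rho_j \Gamma_j, \tag{4.3} \]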

where $\Gamma_0$ has been chosen as $\mathbf{1}$, the identity (up to normalisation). In this representation, the expectation value
of $A$ is just the projection of the vector $\boldsymbol{\rho}$ onto the vector $\mathbf{A}$: $\langle A\rangle_\rho = \mathrm{Tr}(A\rho) = (\mathbf{A}\cdot\boldsymbol{\rho})$.
The equivalence between the Schrödinger and Heisenberg pictures now takes on a very
graphic form. We can either picture leaving the basis vectors (operators) as they are
and rotating the vector $\boldsymbol{\rho}$ under time evolution, or we can picture rotating the basis
vectors (and hence any observable $\mathbf{A}$) in the opposite sense, and leaving $\boldsymbol{\rho}$ unchanged.
In either case, the angle between the two resulting vectors and hence the expectation
value is clearly the same: $\mathbf{A}(t)\cdot\boldsymbol{\rho} = \mathbf{A}\cdot\boldsymbol{\rho}(t)$.
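The equivalence just described can be verified directly for a single qubit. The following is a minimal sketch, assuming an illustrative rotation angle, initial state and observable (none of which come from the text); `components` and `dot` are names chosen here. It computes the components of the operators in the orthonormal Pauli basis and compares the two inner products.

```python
import math

def mul(a, b):
    """Matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dag(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def tr(a):
    return sum(a[i][i] for i in range(len(a)))

S = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
# Orthonormal basis Gamma_j = sigma_j / sqrt(2), so that Tr(Gamma_j Gamma_k) = delta_jk.
GAMMA = [[[S * x for x in row] for row in s] for s in (I2, X, Y, Z)]

def components(op):
    """Real components Tr(op Gamma_j) of a Hermitian operator."""
    return [tr(mul(op, g)).real for g in GAMMA]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

theta = 0.7                                  # illustrative rotation angle
c, s = math.cos(theta / 2), math.sin(theta / 2)
U = [[c, -s], [s, c]]                        # rotation about the y axis
rho = [[1, 0], [0, 0]]                       # initial state |0><0|
A = X                                        # observable sigma_x

A_t = mul(dag(U), mul(A, U))                 # Heisenberg picture: rotate the observable
rho_t = mul(U, mul(rho, dag(U)))             # Schrödinger picture: rotate the state

heisenberg = dot(components(A_t), components(rho))
schrodinger = dot(components(A), components(rho_t))
```

Both numbers equal $\sin\theta$ for this choice of state and observable, as one would expect for a spin rotated from the $z$ axis towards the $x$ axis.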

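Writing the time dependence out explicitly, we will have, in the Heisenberg picture:

\[ \mathbf{A}(t) = \sum_j a_j\, U^\dagger(t)\,\Gamma_j\, U(t), \tag{4.4} \]

while in the Schrödinger picture,

\[ \boldsymbol{\rho}(t) = \sum_j \mathrm{Tr}\bigl(\rho\, U^\dagger(t)\,\Gamma_j\, U(t)\bigr)\,\Gamma_j = \sum_j \langle \Gamma_j(t) \rangle_\rho\, \Gamma_j. \tag{4.5} \]

The expectation value of observable $A$ at time $t$ is simply $\sum_j a_j \langle \Gamma_j(t) \rangle_\rho$.

Notice that in both expressions (4.4) and (4.5), the time evolved operators $\Gamma_j(t) =
U^\dagger(t)\,\Gamma_j\, U(t)$ feature. These operators, along with their expectation values $\langle \Gamma_j(t) \rangle_\rho$, will
be our main objects of interest.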

What should we choose as basis vectors? For $N = 2$, the set of Pauli operators forms
an orthogonal basis set, $\mathrm{Tr}(\sigma_i\sigma_j) = 2\delta_{ij}$ (we adopt the convention that $\sigma_0$ denotes the
identity), thus we can choose $\sqrt{2}\,\Gamma_j \in \{\mathbf{1}, \sigma_x, \sigma_y, \sigma_z\}$ to provide an orthonormal basis
$\{\Gamma_j\}$.^ We are then interested in the behaviour of the set $\{U^\dagger(t)\,(\sigma_i/\sqrt{2})\,U(t)\}$.

So far, all we have done is translate some very familiar results into the language of
the space $V_h(\mathbb{C}^N)$. We now make the all-important move that provides the core result
of the Deutsch-Hayden picture (following Gottesman (1998)). That is, we note that
unitary transformations of operators have the property of being a multiplicative group
homomorphism^:

\[ U^\dagger A B\, U = (U^\dagger A U)(U^\dagger B U). \tag{4.6} \]

In other words, the time evolution of a product will be given by the product of the time
evolution of the individual operators. Thus we do not need to follow the evolution of
the whole basis set of operators, but only of a generating set. For example, in the $N = 2$
case, noting that $\sigma_x\sigma_y = i\sigma_z$, we see that $\sigma_z(t) = -i\sigma_x(t)\sigma_y(t)$ and that we need only
follow the evolution of the generating set $\{\sigma_x, \sigma_y\}$ to capture the time evolution of the
whole system. (For completeness, note that $\sigma_i^2 = \mathbf{1}$; the time evolution of the identity
is of course trivial.)

^The choice of the Pauli operators as a basis set gives us the familiar Bloch sphere representation of
the density matrix of a two-state system.

^A map $f : A \to B$ is a group homomorphism if $\forall a_1, a_2 \in A$, $f(a_1 a_2) = f(a_1) f(a_2)$.
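The generating-set claim for $N = 2$ is easily checked numerically. A minimal sketch follows, assuming an arbitrarily chosen single-qubit unitary; the particular parameters, and the name `evolve`, are illustrative only. The evolved $\sigma_z$ is recovered from the evolved $\sigma_x$ and $\sigma_y$ alone.

```python
import cmath
import math

def mul(a, b):
    """Matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dag(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

# An arbitrarily chosen SU(2) element: |a|^2 + |b|^2 = 1.
a = math.cos(0.4) * cmath.exp(0.3j)
b = math.sin(0.4) * cmath.exp(-1.1j)
U = [[a, -b.conjugate()], [b, a.conjugate()]]

def evolve(op):
    """Heisenberg-picture evolution U^dagger op U."""
    return mul(dag(U), mul(op, U))

sx_t, sy_t, sz_t = evolve(X), evolve(Y), evolve(Z)

# sigma_z(t) reconstructed from the generating set, using the homomorphism property (4.6).
product = [[-1j * v for v in row] for row in mul(sx_t, sy_t)]
deviation = max(abs(product[r][c] - sz_t[r][c]) for r in range(2) for c in range(2))
```

Here `deviation` vanishes to rounding error, although `sz_t` itself differs appreciably from the unevolved $\sigma_z$.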

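For $N = 2^n$, $n$-fold tensor products of Pauli matrices will provide us with an
orthogonal set, thus our basis operators will be

\[ \Gamma_j = 2^{-n/2}\, \sigma_{m_1} \otimes \sigma_{m_2} \otimes \cdots \otimes \sigma_{m_n}, \tag{4.7} \]

where the index $j$ runs from 0 to $(4^n - 1)$ and labels an ordered $n$-tuple $\langle m_1, m_2, \ldots, m_n \rangle$,
$m_i \in \{0, 1, 2, 3\}$. We are interested in the behaviour of the $4^n$ $\Gamma_j(t)$; again, however, we
need only track the evolution of objects of the form

\[ \mathbf{1} \otimes \mathbf{1} \otimes \cdots \otimes \sigma_{m_i} \otimes \cdots \otimes \mathbf{1}, \]

which we denote $q_{i,m_i}$; the $\Gamma_j$ are given by ordinary matrix multiplication of these
objects:

\[ \Gamma_j = 2^{-n/2} \prod_{i=1}^{n} q_{i,m_i}. \tag{4.8} \]

The behaviour of the $\Gamma_j(t)$ is thus determined by following the time evolution of a
minimum of $2n$ of the $q_{i,m_i}$ and taking appropriate products.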

The $q_{i,m_i}$ with $m_i$ running from 1 to 3 are, of course, the components of the Deutsch-
Hayden descriptor $\mathbf{q}_i$. This choice of three operators per system as the basic objects
whose time evolution we are to follow is more than is strictly necessary for a generating
set, but it leads to a very simple description of an individual system, as we shall shortly
see. First, however, note that the density matrix at time $t$ can now be written as

\[ \rho(t) = \frac{1}{2^n} \sum_{m_1 m_2 \cdots m_n} \Bigl\langle \prod_{i=1}^{n} q_{i,m_i}(t) \Bigr\rangle_\rho\, \sigma_{m_1} \otimes \sigma_{m_2} \otimes \cdots \otimes \sigma_{m_n}. \tag{4.9} \]

That is, the $4^n$ components $\rho_j(t)$ of the vector representing the density matrix at time
$t$ are given by the expectation values of products of the $q_{i,m_i}(t)$. The state of the joint
system at time $t$ is thus completely determined by the time evolution of the $2n$ or $3n$
chosen $q_{i,m_i}$ and the initial state $\rho$. To see the significance of the triple $\mathbf{q}_i$, note that
any observable on the $i$th system alone will have the form:

\[ A^i = \sum_{m_i=0}^{3} a_{m_i}\, (\mathbf{1} \otimes \mathbf{1} \otimes \cdots \otimes \sigma_{m_i} \otimes \cdots \otimes \mathbf{1}) = a_0 \mathbf{1}^{\otimes n} + \sum_{m_i=1}^{3} a_{m_i}\, q_{i,m_i}, \tag{4.10} \]

thus $\mathbf{q}_i(t)$ tells us about observables on the $i$th system at time $t$ and $\langle \mathbf{q}_i(t) \rangle_\rho$ determines
their expectation values. Equivalently, the three components of $\langle \mathbf{q}_i(t) \rangle_\rho$ give us the
interesting components of the vector $\boldsymbol{\rho}(t)$ lying in the subspace spanned by observables
pertaining to the $i$th system alone; and with renormalisation, the components, in our
vector representation, of the reduced density matrix of the $i$th system.
Explicitly, this reduced density matrix is:

\[ \rho^i(t) = \frac{1}{2} \sum_{m_i} \langle q_{i,m_i}(t) \rangle_\rho\, \sigma_{m_i}. \tag{4.11} \]

mi 

It is also easy to write down the reduced density matrix for any grouping of subsystems.
If we were interested in the systems $i$, $j$ and $k$, say, taking the partial trace of (4.9) over
the other systems will give us a reduced state of the form:

\[ \rho^{ijk}(t) = \frac{1}{2^3} \sum_{m_i m_j m_k} \langle q_{i,m_i}(t)\, q_{j,m_j}(t)\, q_{k,m_k}(t) \rangle_\rho\, \sigma_{m_i} \otimes \sigma_{m_j} \otimes \sigma_{m_k}. \tag{4.12} \]

So we have now seen the basis for the first claim to locality: given just the descriptors
$\mathbf{q}_i(t)$ for each individual system, and the initial state $\rho$, we may calculate the reduced
density matrix for each subsystem, and the density matrix for successively larger groups
of subsystems, up to and including the density matrix for the system as a whole.
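The first locality claim can be illustrated for $n = 2$. The sketch below rests on assumptions made only for the example: the initial state $|00\rangle\langle 00|$ and a Hadamard followed by a controlled-NOT as the network. It rebuilds both the reduced state of the first qubit and the global state from expectation values of (products of) descriptor components, and compares the latter with the usual Schrödinger-picture result.

```python
import math

def mul(a, b):
    """Matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dag(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def tr(a):
    return sum(a[i][i] for i in range(len(a)))

def kron(a, b):
    """Tensor product a (x) b, first factor most significant."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

I2 = [[1, 0], [0, 1]]
SIGMA = [I2, [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

h = 1 / math.sqrt(2)
HADAMARD = [[h, h], [h, -h]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
U = mul(CNOT, kron(HADAMARD, I2))            # illustrative entangling network
rho0 = [[1 if r == c == 0 else 0 for c in range(4)] for r in range(4)]  # |00><00|

def heisenberg(op):
    return mul(dag(U), mul(op, U))

def expect(op):
    return tr(mul(rho0, op))

# Descriptor components at time t (index 0 is the identity, kept for convenience).
q1 = [heisenberg(kron(s, I2)) for s in SIGMA]
q2 = [heisenberg(kron(I2, s)) for s in SIGMA]

# Equation (4.11): reduced state of qubit 1 from <q_{1,m}(t)> alone.
rho1 = [[0.5 * sum(expect(q1[m]) * SIGMA[m][r][c] for m in range(4))
         for c in range(2)] for r in range(2)]

# Equation (4.9): global state from expectation values of products of descriptors.
rho_t = [[0j] * 4 for _ in range(4)]
for m1 in range(4):
    for m2 in range(4):
        coeff = expect(mul(q1[m1], q2[m2])) / 4
        basis = kron(SIGMA[m1], SIGMA[m2])
        for r in range(4):
            for c in range(4):
                rho_t[r][c] += coeff * basis[r][c]

rho_schrodinger = mul(U, mul(rho0, dag(U)))
max_diff = max(abs(rho_t[r][c] - rho_schrodinger[r][c])
               for r in range(4) for c in range(4))
```

For this network `rho1` is the maximally mixed state and `max_diff` vanishes to rounding error. Note that `rho0` enters every expectation value: the descriptors alone do not suffice.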

We may note in passing another interesting feature of the Deutsch-Hayden
formalism. A question that often arises, particularly in discussion of quantum correlations, is
whether different preparations of the same density matrix really correspond to
physically distinct situations, as all observable properties of systems having the same density
matrix are identical. A pleasing aspect of the Deutsch-Hayden set-up is that it provides
a representation in which differences in the way systems are prepared may find direct
expression in the formalism^. For example, it may be the case that $\langle \mathbf{q}_i(t) \rangle_\rho = \langle \mathbf{q}_j(t) \rangle_\rho$,
i.e., the two systems have the same reduced density matrix, but that $\mathbf{q}_i(t)$ and $\mathbf{q}_j(t)$
differ, representing differences in their histories.

4.2.1 Locality claim (2): Contiguity 

Let us now consider the second claim to locality. This, recall, was the claim that it can
be seen explicitly in the Deutsch-Hayden formalism that local unitary operations have
only a local effect. As Jozsa (2001) has emphasized, this aspect of the Deutsch-Hayden
picture is in fact a re-expression of the no-signalling theorem.

In the Heisenberg picture, a sketch of a simple version of the theorem would be
as follows: let us write an observable acting on subsystem $i$ alone as $A^i = \mathbf{1} \otimes A$;
at time $t$, $A^i(t) = U^\dagger(t)(\mathbf{1} \otimes A)U(t)$. Suppose $U(t)$ does not act on $i$, then $A^i(t) =
(U^\dagger \otimes \mathbf{1})(\mathbf{1} \otimes A)(U \otimes \mathbf{1}) = \mathbf{1} \otimes A$, i.e. an observable is unaffected by unitary operations
on systems it does not pertain to. Now consider our $q_{i,m_i}$; the foregoing clearly applies
to them: a unitary operation on a system $j$ does not affect $q_{i,m_i}$. More generally,
if our network of $n$ systems were divided up into two subsets of systems, $M$ and $N$,
whose members interact amongst themselves but not with systems from the other subset,
then the unitary operator describing the time evolution of the network will factorise:
$U^M \otimes U^N$. Then the $q_{i,m_i}$ for $i \in M$ will not be affected by $U^N$, nor those for $i \in N$
by $U^M$. We can do more than merely note that the descriptors of a set of interacting
systems do not depend on unitary operations on a disjoint set, however. In fact we can
see that the descriptor at time $t$ of a given system will depend, apart from the history
of operations applied to it alone, only on its previous interactions and on the histories
and past interactions of the systems it has interacted with. This property may be called
contiguity; and is best seen with a simple example (Fig. 4.1).

Imagine we have two systems, $i$ and $j$, and that we are going to perform two unitary
operations. First, at $t = 1$, we perform $U_1$, which acts on $j$ alone; clearly, after this
operation, $q_{i,m_i}(1) = U_1^\dagger q_{i,m_i} U_1 = q_{i,m_i}$. Next we allow $i$ and $j$ to interact via $U_2$; now,
however, $q_{i,m_i}(2) = U_1^\dagger U_2^\dagger q_{i,m_i} U_2 U_1$. Because $U_2$ acts on both $i$ and $j$, $U_1$ no longer
factors out; interaction causes the $q_{i,m_i}$ to lose the form of a product of a single Pauli
operator with the identity and they can pick up a dependence on what has happened
to the system that $i$ has interacted with. We can say that all this remains happily
local, however, as this dependence on the history of $j$ only arises following an entangling
interaction between the two systems. The reasoning extends in the obvious way to more
complicated chains; if $j$ had previously interacted with $k$, then once $i$ and $j$ interact,
the $q_{i,m_i}(t)$ pick up what they would not previously have had, a dependence on what
has happened to $k$; and so on.

[Figure 4.1: circuit diagrams for cases (a) and (b), each with time steps $t = 1$ and $t = 2$; image not reproduced. Descriptors given in the figure:
a) $\mathbf{q}_i(1) = \mathbf{q}_i$, $\mathbf{q}_i(2) = U_1^\dagger U_2^\dagger \mathbf{q}_i U_2 U_1$; $\mathbf{q}_j(1) = U_1^\dagger \mathbf{q}_j U_1$, $\mathbf{q}_j(2) = U_1^\dagger U_2^\dagger \mathbf{q}_j U_2 U_1$.
b) $\mathbf{q}_i(1) = U_1^{\prime\dagger} \mathbf{q}_i U_1^{\prime}$, $\mathbf{q}_i(2) = U_1^{\prime\dagger} \mathbf{q}_i U_1^{\prime}$; $\mathbf{q}_j(1) = U_1^{\prime\dagger} \mathbf{q}_j U_1^{\prime}$, $\mathbf{q}_j(2) = U_1^{\prime\dagger} U_2^{\prime\dagger} \mathbf{q}_j U_2^{\prime} U_1^{\prime}$.]

Figure 4.1: a) At $t = 1$, a unitary operation, $U_1$, which acts only on system $j$ is applied;
the descriptor of system $i$, $\mathbf{q}_i(1)$, is unaffected. After $i$ and $j$ interact via $U_2$ at $t = 2$,
however, $\mathbf{q}_i(2)$ will depend on the operation $U_1$. In (b) systems $i$ and $j$ initially interact
via $U_1^{\prime}$. At $t = 2$, $U_2^{\prime}$, acting on $j$ alone, is applied; $\mathbf{q}_i(2)$ is unaffected.

^Although, it must be noted that as we are in the context of no-collapse quantum mechanics, the
possibility does not obtain of preparing a distant system in a particular way via collapse, à la EPR.

To re-emphasize that the Deutsch-Hayden descriptor of a system at time $t$ will not,
however, depend on what happens at $t$ to a system with which it has interacted in the
past, we take the following simple example (Fig. 4.1). Again consider two systems $i$ and
$j$; this time, however, we begin by allowing them to interact via a unitary operation $U_1^{\prime}$,
then

\[ q_{i,m_i}(1) = U_1^{\prime\dagger} q_{i,m_i} U_1^{\prime} \neq q_{i,m_i}, \quad \text{and} \quad q_{j,m_j}(1) = U_1^{\prime\dagger} q_{j,m_j} U_1^{\prime} \neq q_{j,m_j}. \tag{4.13} \]

Now we perform $U_2^{\prime}$, which acts on $j$ alone. Whilst $q_{j,m_j}(2) = U_1^{\prime\dagger} U_2^{\prime\dagger} q_{j,m_j} U_2^{\prime} U_1^{\prime}$, for the
descriptor of $i$ we have

\[ q_{i,m_i}(2) = U_1^{\prime\dagger} U_2^{\prime\dagger} q_{i,m_i} U_2^{\prime} U_1^{\prime} = U_1^{\prime\dagger} q_{i,m_i} U_1^{\prime}, \tag{4.14} \]

$U_2^{\prime}$ factors out; there is no immediate dependence on what happens at the present only
to $j$, even when $i$ and $j$ have interacted in the past.

The picture, then, is that following an interaction, the descriptor of a system $i$
picks up a backwards looking (and hence what we might call a local, or contiguous)
dependence on what has happened to the system that $i$ has interacted with, and on
the previous interactions of that system. As an illustration, let us consider how the
non-factorisable probability distributions for Bell-type experiments come about in this
formalism (Fig. 4.2).

Figure 4.2: A Bell experiment. An entangled state of systems 2 and 3 is prepared (here
by the action of a Hadamard gate, $H$, which performs a rotation by $\pi$ around an axis
at an angle of $\pi/4$ in the $z$-$x$ plane; followed by a controlled-NOT operation, in which the circle
indicates the control qubit, the point of the arrow, the target, to which $\sigma_x$ is applied if
the control is in the computational state $|1\rangle$) and the entangled pair is shared between
two distant locations. A measurement at an angle $\theta$ is performed on 2 and the outcome
recorded in system 1; a measurement at an angle $\phi$ made on 3 and recorded in 4. Time
runs along the horizontal axis. Note that in no-collapse quantum mechanics without
added values, correlations do not obtain until they are displayed by a suitable joint
measurement.

As usual, we begin by preparing a pair of systems (2 and 3) in an entangled state.
These systems are spatially separated and two local measurements performed, at an
angle $\theta$ on system 2 and an angle $\phi$ on system 3. The outcomes are recorded into systems
1 and 4 respectively. Immediately following the measurement, the descriptor of system
1 will depend on $\theta$, but not on the parameter characterizing the distant measurement,
$\phi$. However, as system 1 has interacted with system 2, its descriptor will also depend on
what has happened to 2 in the past; which was, in this case, an entangling interaction
between 2 and 3. Similarly, the descriptor of 4 following the local measurement will
depend on $\phi$ and not on $\theta$, but will depend too on what happened to system 3, that
is, on 3's initial entangling with 2. Because the descriptors of 1 and 4 depend, following
the pair of local measurements, on the initial entangling interaction between 2 and 3,
their product can give rise to the familiar non-factorisable probability distribution when
1 and 4 are subsequently brought together and joint measurements performed.

It is tempting to think of the contiguity property of the Deutsch-Hayden descriptors 
as depicting a causal chain in which dependence on the parameters characterising the 
history of a system is passed on during interactions, or even more metaphorically, in 
terms of information about the relevant history of a system being transmitted via local 
interactions. More soberly, we see that if the $q_{i,m_i}(t)$ are taken to be the primary objects
of interest then the effects of local unitary operations on these are indeed explicitly 
seen to be local, as the descriptor of a system cannot come to depend on a parameter 
characterising a unitary operation selected in a distant region without the system having 
undergone an appropriate chain of local interactions. As we have said, however, this is 
just the no-signalling theorem writ large. 
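The two cases of Fig. 4.1 can be reproduced in a few lines. The following is a sketch only: the controlled-NOT standing in for the interaction, the rotation standing in for the operation on $j$ alone, and the names `local_on_j`, `descriptor` and `close` are all assumptions made for illustration. System $i$ is the first tensor factor and the $x$-component of its descriptor is tracked.

```python
import math

def mul(a, b):
    """Matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dag(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def kron(a, b):
    """Tensor product a (x) b, first factor most significant."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # control i, target j
IDENTITY4 = kron(I2, I2)

def local_on_j(theta):
    """A rotation applied to system j alone: 1 (x) R_y(theta)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return kron(I2, [[c, -s], [s, c]])

def descriptor(q, history):
    """Heisenberg-evolve q through the operations in `history` (earliest first)."""
    u = IDENTITY4
    for op in history:
        u = mul(op, u)
    return mul(dag(u), mul(q, u))

def close(a, b, tol=1e-12):
    return all(abs(a[r][c] - b[r][c]) < tol for r in range(4) for c in range(4))

q_i = kron(X, I2)  # x-component of the descriptor of system i at t = 0

# Case (a): an operation on j alone, then an interaction between i and j.
a_after_local = descriptor(q_i, [local_on_j(0.3)])
a_final_1 = descriptor(q_i, [local_on_j(0.3), CNOT])
a_final_2 = descriptor(q_i, [local_on_j(1.2), CNOT])

# Case (b): the interaction first, then an operation on j alone.
b_final_1 = descriptor(q_i, [CNOT, local_on_j(0.3)])
b_final_2 = descriptor(q_i, [CNOT, local_on_j(1.2)])
```

In case (a) the descriptor of $i$ is untouched by the local operation on $j$, but after the interaction it differs according to which rotation had earlier been applied to $j$. In case (b) the descriptor of $i$ is changed by the interaction, yet is the same whichever rotation is subsequently applied to $j$.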

4.3 Assessing the Claims to Locality 

Having outlined the machinery of the Deutsch-Hayden approach, we may now consider
the status of its claim to provide a particularly local picture of quantum mechanics. As
remarked in the introduction, it is necessary to distinguish two modes of interpretation
of the formalism.

4.3.1 The Conservative Interpretation

The conservative interpretation is to take the formalism at face-value, simply as a
re-writing of standard (unitary) quantum mechanics, in which we fix the initial state $\rho$ and
track time evolution via the $\mathbf{q}_i(t)$. If we want to talk in terms of properties, we may see
the $\mathbf{q}_i(t)$, against the background of a chosen $\rho$, as denoting propensities for the display
of certain individual and joint probability distributions for measurement outcomes, via
equations (4.11) and (4.12).

Locality Claim (1) 

The first claim to locality was that the global state can be seen to be determined by
the states of individual subsystems. What is certainly true is that given the $n$ $\mathbf{q}_i(t)$,
the $4^n$ $\Gamma_j(t)$ are determined and hence we can keep track of the changes to the joint
system over time. Note, however, that the initial global state $\rho$ still has to be specified
and plays a very important role. It is needed to determine the experimentally accessible
properties of individual and joint systems; both the $\Gamma_j(t)$ and $\rho$ are required to determine
expectation values of measurements. That it is the global state is crucial, as in general
in the presence of entanglement, $\langle q_{i,m_i}(t)\, q_{j,m_j}(t) \rangle_\rho \neq \langle q_{i,m_i}(t) \rangle_\rho \langle q_{j,m_j}(t) \rangle_\rho$.
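A short check makes both points concrete. It is a sketch under assumptions chosen for illustration: a Hadamard followed by a controlled-NOT as the entangling network, and two alternative initial states. The same pair of descriptor components yields joint expectation values that do not factorise, and that change when a different initial global state is supplied.

```python
import math

def mul(a, b):
    """Matrix product of two square lists-of-lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dag(a):
    """Hermitian conjugate."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def tr(a):
    return sum(a[i][i] for i in range(len(a)))

def kron(a, b):
    """Tensor product a (x) b, first factor most significant."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]

I2 = [[1, 0], [0, 1]]
Z = [[1, 0], [0, -1]]
h = 1 / math.sqrt(2)
HADAMARD = [[h, h], [h, -h]]
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
U = mul(CNOT, kron(HADAMARD, I2))  # illustrative entangling network

def projector(index):
    """Density matrix of the computational basis state with the given index."""
    return [[1 if r == c == index else 0 for c in range(4)] for r in range(4)]

def expect(op, rho):
    return tr(mul(rho, op)).real

# z-components of the two descriptors after the network has acted.
q1z = mul(dag(U), mul(kron(Z, I2), U))
q2z = mul(dag(U), mul(kron(I2, Z), U))

rho_00 = projector(0)  # initial state |00><00|
rho_01 = projector(1)  # initial state |01><01|

joint_00 = expect(mul(q1z, q2z), rho_00)
single1_00 = expect(q1z, rho_00)
single2_00 = expect(q2z, rho_00)
joint_01 = expect(mul(q1z, q2z), rho_01)
```

With the first initial state the joint expectation value is $+1$ while each single-system expectation value vanishes, so the product of the latter cannot reproduce the former; with the second initial state the very same descriptors give a joint expectation value of $-1$.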

With the global state of the system still playing such an important role, however, 
it is not clear that we have yet gained much in the way of locality by considering the 
Deutsch-Hayden construction under the conservative interpretation. Taking the simplest 
picture of a time evolving density operator, products of the $\mathbf{q}_i(t)$ determine how any
given initial state will evolve; it is no surprise if the initial state of the joint system is
specified and we have kept track of the changes to the system (albeit that these are fixed
by the individual $\mathbf{q}_i(t)$) that we then know what the final state will be.

In reply it is open to Deutsch and Hayden to argue that appeal to the global state
is in fact innocuous, as a standard initial state can always be chosen and the $\mathbf{q}_i(0)$
adjusted accordingly. To be sustained, however, this line of argument commits one to
the ontological interpretation, which we shall consider in due course. For now, let us 
consider the status of the second locality claim under the conservative interpretation. 



CHAPTER 4. THE DEUTSCH-HAYDEN APPROACH 104 
Locality Claim (2) 

We begin by asking why it might seem important to show explicitly that local unitary 
operations have only a local effect. (We recall, of course, that the standard no-signalling 
theorem already assures us that local unitaries will not have any effect on the probability 
distributions for distant measurements). It is clear that if we were only to consider the 
question of nonlocality as it is usually raised in the context of Bell-type experiments, 
then the Deutsch-Hayden approach would not offer us any distinctive advantages. For, 
as has been mentioned, their point of departure is to assume no-collapse quantum me- 
chanics with no determinate values added, thus the appropriate comparisons must either 
be with an Everettian or a statistical interpretation. But it is well known that the Ev- 
erett interpretation does not suffer from the familiar difficulties with nonlocality in the 
Bell or EPR setting that accrue to theories involving collapse or additional variables 
(indeed, this is often presented as one of the selling-points of the approach); while for 
a statistical interpretation, the familiar no-signalling theorem does all that could be
required to ensure that nonlocality does not arise (see Timpson and Brown (2002) for
further discussion and references). Thus if one is considering the question of locality in
this context, the crucial factor is the assumption of quantum mechanics without a real
process of collapse, and without additional variables, rather than anything distinctive 
about the Deutsch-Hayden approach. 

However, things may look rather less clear-cut when one considers the phenomena of
entanglement assisted communication such as superdense coding (Bennett and Wiesner,
1992) and teleportation (Bennett et al., 1993). These phenomena vividly illustrate the
fact that in the presence of entanglement, local unitary operations can have a very
significant effect on the global state of the system. And might this not indicate a novel 
sort of nonlocality of which even the Everett interpretation would be guilty? If so, 
the Deutsch-Hayden approach would seem to offer a clear advantage, with its explicit 
locality regarding the effects of local unitary operations. 

Consider the example of superdense coding in more detail (Fig. 4.3). In this protocol,
Alice is able to send Bob two bits of information with the transmission of a single qubit, 
by making use of the global effect of a local operation. 



Figure 4.3: Superdense coding. A maximally entangled state of systems 1 and 2 is
prepared by Bob (B). System 1 is sent to Alice (A) who may do nothing, or perform
one of the Pauli operations. On return of system 1, Bob performs a measurement in
the Bell basis, here by applying a controlled-NOT operation, followed by the Hadamard
gate. This allows him to infer which operation was performed by Alice.



The two parties begin by sharing a maximally entangled state; let us say the singlet 
state. Then, simply by applying one of the Pauli operators to her half of the shared 
system, Alice may flip the joint state into one of the others of the four orthogonal, 
maximally entangled Bell states: a local operation has resulted in a change in the global 
state that is as great as could be — from the initial state to one orthogonal to it. Now, 
if Alice sends her half of the shared system to Bob, he just needs to measure in the Bell 
basis to determine which of the four operations Alice performed, arriving at two bits 
of information. In this protocol, the possibility of changing the global state by a local 
operation has been used to send information in a very unexpected way. The phenomenon 
of teleportation may also be viewed as arising from the fact that the set of maximally
entangled states may be spanned by local unitary operations (Braunstein et al.).
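The superdense coding protocol just described can be checked numerically. What follows is a minimal sketch, not a reconstruction of any particular presentation: states are plain lists of four amplitudes, Bob's Bell-basis measurement is modelled simply by computing overlaps with the four Bell states, and all names (`PAULIS`, `BELL`, `apply_to_system_1`, `decode`) are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# The four operations open to Alice (2x2 matrices as nested lists).
PAULIS = {
    "I": [[1, 0], [0, 1]],
    "X": [[0, 1], [1, 0]],
    "Y": [[0, -1j], [1j, 0]],
    "Z": [[1, 0], [0, -1]],
}

# Bell basis; amplitude index is 2*a + b, with a for system 1 and b for system 2.
BELL = {
    "Phi+": [S, 0, 0, S],
    "Phi-": [S, 0, 0, -S],
    "Psi+": [0, S, S, 0],
    "Psi-": [0, S, -S, 0],  # the singlet state
}


def apply_to_system_1(u, psi):
    """Apply the single-qubit operation u to system 1 only."""
    out = [0j] * 4
    for a in range(2):
        for b in range(2):
            out[2 * a + b] = sum(u[a][k] * psi[2 * k + b] for k in range(2))
    return out


def inner(phi, psi):
    """Inner product <phi|psi>."""
    return sum(complex(x).conjugate() * y for x, y in zip(phi, psi))


def bell_measurement(psi):
    """Probability of each outcome of a Bell-basis measurement on psi."""
    return {name: abs(inner(b, psi)) ** 2 for name, b in BELL.items()}


def decode(psi):
    """Bob's inference: the Bell outcome with the highest probability."""
    probs = bell_measurement(psi)
    return max(probs, key=probs.get)


# Alice acts locally on her half of the singlet; Bob then decodes.
encoded = {op: apply_to_system_1(u, BELL["Psi-"]) for op, u in PAULIS.items()}
outcomes = {op: decode(state) for op, state in encoded.items()}
```

Each of Alice's four local operations takes the singlet to a different Bell state (up to an overall phase), so Bob's measurement distinguishes them with certainty and he obtains two bits.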

So, does the example of entanglement assisted communication indicate an important 
sphere in which Deutsch-Hayden presents benefits of locality? Note that these examples 
do not affect the question of locality for the statistical interpretation, as on this inter- 
pretation the quantum state does not correspond to anything real. But one might be 
interested in a more realist approach. Thus we should ask how the Everett interpretation 
fares with locality in entanglement assisted communication.

It can in fact be argued that the examples of superdense coding and teleportation do 
not demonstrate a new form of nonlocality in Everett. Our worry is about the effect on
the global state of local operations; however, even if we are being robustly realist about
it, the global state is not itself a spatio-temporal entity. Thus changes in the global 
state do not correspond to local or to non-local changes. It is better to think in terms 
of changes to properties of the systems; but it is clear that unlike the sort of change 
that would be associated with collapse, the effects of local unitary operations that we 
are considering do not give rise to any changes in local and non-relational properties of 
the separated systems (i.e., locally observable probability distributions are unchanged). 
Thus, although certainly striking, and non-classical, the potential global effects of a local 
unitary operation in the presence of entanglement are not appropriately construed as 
non-local. 
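The claim that locally observable probability distributions are unchanged can be made explicit in one line. As a sketch, write ρ_AB for the joint state of the two separated systems and U_A for a unitary applied to system A alone (both symbols are introduced here for illustration only); the reduced state of B after the operation is then

```latex
\[
\rho_B' \;=\; \mathrm{Tr}_A\!\left[(U_A \otimes I)\,\rho_{AB}\,(U_A^{\dagger} \otimes I)\right]
        \;=\; \mathrm{Tr}_A\!\left[(U_A^{\dagger} U_A \otimes I)\,\rho_{AB}\right]
        \;=\; \mathrm{Tr}_A\!\left[\rho_{AB}\right] \;=\; \rho_B .
\]
```

The second equality holds because operators acting on A alone may be cycled under the partial trace over A. Since every probability distribution for measurements on B alone is fixed by its reduced state, none of them is altered by the distant operation, however large the change to the global state.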

The case is clear enough for superdense coding; teleportation invites a further brief 
comment. When this protocol is analysed from the Everett perspective, the significant 
feature is that immediately following Alice's measurement and before she sends a record 
of her outcome to Bob, Bob's system will already have acquired a definite state related 
to the state Alice is sending, relative to the outcome of Alice's measurement. And this 
may look like a form of non-locality: the pertinent relative state of Bob's system has 
come to depend on the parameters characterising the state being sent by Alice, merely 
as a result of a local operation (measurement) carried out at a distance by Alice, and 
without any direct interaction between the two sides of the experiment. 

It seems that this appearance of non-locality is again not genuine, however. What 
have changed as a result of Alice's measurement are the relative states of Bob's system; 
that is, roughly, relational properties of his system. It is no mystery that relational 
properties can be affected unilaterally by operations on one of the relata and it certainly 
does not connote non-locality⁴. The effect of Alice's measurement has been to entangle
further systems with the initial entangled pair, namely, the system whose state was to
be transmitted and systems recording the outcome of the Bell measurement. The trick
is that the type of measurement interaction Alice performs has been chosen such that
the way in which the systems recording the outcome of her measurement are allowed
to become related to Bob's system (in virtue of the initial entanglement) entails that
relative to their outcome recording states, Bob's system will have the required states.
That is, the genuine change is in fact all on Alice's side. (Vaidman has also
argued to the effect that teleportation does not involve nonlocality, when understood in
Everettian terms.)

⁴Consider the following classical example: We have two heaps of sand, x and y, piled on the ground,
some distance apart. Let us say x is heavier than y. By adding a few more shovel-fulls to y, we may
make this statement false; but this does not imply a non-local effect on x.

The conclusion is that when considered under the conservative interpretation, the
explicit locality in the effect of local unitary operations that the Deutsch-Hayden
formalism provides in the contiguity of changes in the q_i(t) does not vouchsafe an important
sense of locality that would be lacking in an Everettian or statistical interpretation.
Indeed we can see that it would necessarily be quite misleading to suggest that the
contiguity property points to a novel feature of locality in the Deutsch-Hayden formalism
interpreted conservatively. As we have noted, the novelty must be supposed to concern
the absence of any effect on the global state from local unitary operations, even in the
presence of entanglement; and this indeed follows, in a trivial sense, if we fix the
initial state ρ and track time evolution via the q_i(t), adopting the Heisenberg viewpoint.
But what we described in the Schrödinger picture as a change in the global state
following a local operation now merely becomes, in the Heisenberg picture, a change in
the expectation values for some joint observables that can't be understood in terms of
changes in expectation values for observables pertaining to subsystems. But why, if we
were supposed to be worried at all, should we be less worried by changes in these joint
expectation values as a result of local unitary operations, than in changes to the global
state?

4.3.2 The Ontological Interpretation

Maudlin, in the course of his careful discussion of the question of holism in quantum
mechanics, arrives at the following dialectical position:

We now have a reasonably clear question: according to the quantum theory,
can the physical state of a system be completely specified by the attribution
of physical states to the spatial parts of the system, together with facts about
how those parts are spatiotemporally related? (Maudlin, 1998, p. 50)

In standard quantum theory, the answer, of course, is no. The point of the
Deutsch-Hayden approach under the ontological interpretation is to answer instead 'yes'.

To see how this might be achieved, recall why the conservative interpretation must
fail to give an affirmative answer to Maudlin's question.

In the conservative interpretation, the assignment of properties at a given time is
necessarily a joint venture between the global state ρ and the descriptors; and as we
noted (Section 4.3.1), appeal has to be made to global properties of the state. The
q_i(t) cannot themselves be said to denote properties of the subsystems; rather, they
determine what the effects of dynamical evolution would be for any possible initial state
of the whole system. It is only when some particular initial state is specified that
we may begin to talk about the properties of subsystems and of the whole, denoted
by expectation values of the q_i(t) and products of the q_{i,m_i}(t), respectively. And we
have already noted a crucial feature several times: in general, the properties that are 
assigned to joint systems (expectation values for joint observables, or propensities for the 
display of certain joint probability distributions on measurement), will not be reducible 
to properties assigned to subsystems (individual expectation values and propensities). 

The ontological interpretation departs from this in two ways. First, the status of the 
global quantum state is fundamentally revised. A fixed standard state is adopted by 
convention (for example, the computational basis state |0⟩|0⟩...|0⟩) and it is relegated
to playing a purely mathematical role in the machinery of the theory, rather than
representing any physical contingency. Its status is now simply that of a rule for reading off
the observable properties of systems. Secondly, the q_i(t) are taken to represent intrinsic
(i.e., non-relational) and occurrent (i.e., non-dispositional) properties of individual
subsystems. The first feature is required of these properties if the global properties of the
total system are to be reduced to the properties currently possessed by its subsystems; 
the second feature is a natural requirement in this context. A change in the descriptor 
of a system now represents a change in the actually possessed, intrinsic properties of the 
system. These intrinsic properties are clearly of a new sort; and they do not receive any 
further characterisation or explanation than is provided by their role in the formalism. 
Thus on the ontological interpretation, the content of the first claim to locality is that
the global properties of the joint system are reducible to local, intrinsic properties of
subsystems, while the content of the second is that changes in the global properties are
reducible to changes in the currently possessed properties of subsystems. Under the
ontological interpretation, then, we certainly have an interesting thesis. Note that now,
as adumbrated earlier, changes in the initial conditions of a system may be reflected
in changes in the q_i(0), whereas under the conservative interpretation they would be
represented by changes in the time-zero density matrix, ρ(0)⁵.

It can hardly be emphasized enough that the approach of the ontological interpre- 
tation marks a considerable departure from our usual ways of thinking about quantum 
mechanics. Indeed it might best be thought of as the proposal of a new theory, in which 
the behaviour of the intrinsic properties denoted by the q_i(t) is fundamental⁶.

In gaining with respect to reducibility, however, the ontological interpretation ac- 
quires what might be felt to be some rather objectionable features. The first is a problem 
of underdetermination. 

The central, distinctive, claim of the ontological interpretation is that the intrinsic 
properties of a subsystem, denoted by the descriptor q_i(t), are fundamental. This means
that there is a fact about which properties a given system actually possesses at any 
stage; and thus also, a fact about what the true descriptor of the system is. However 
the interpretation also involves a strict distinction between observable and unobservable 
properties. The observable properties are those that are given by expectation values. 
But this means that we can never in fact know the true descriptor of a system. We only 
have empirical access to expectation values and to the density matrices of systems, but 
continuously many different q_i(t) will be compatible with this data. The true descriptor
of a system could be any one of the many that would provide consistency with both the
density matrix of the subsystem (eqn. (4.11)) and that of the total system (eqn. (4.9)).
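The underdetermination can be illustrated in a single-qubit toy case. The sketch below rests on assumptions that are ours rather than Deutsch and Hayden's: the standard state is taken to be |0⟩, the descriptor after a gate history U is taken to be the triple U†σU, and the two histories compared (a Hadamard gate, and the same gate preceded by a rotation about z through an arbitrary angle `PHI`) are chosen only for illustration.

```python
import cmath
import math

S = 1 / math.sqrt(2)
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
HADAMARD = [[S, S], [S, -S]]


def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def rz(phi):
    """Rotation about z; it changes |0> only by a phase."""
    return [[cmath.exp(-0.5j * phi), 0], [0, cmath.exp(0.5j * phi)]]


def descriptor(u):
    """Heisenberg-picture triple (U† sigma_x U, U† sigma_y U, U† sigma_z U)."""
    return {k: matmul(dagger(u), matmul(s, u)) for k, s in SIGMA.items()}


def expectation_in_zero(op):
    """<0| op |0> for the fixed standard state |0>."""
    return op[0][0]


def max_difference(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


PHI = 0.7  # any angle will do; a continuum of alternatives is available
q_first = descriptor(HADAMARD)
q_second = descriptor(matmul(HADAMARD, rz(PHI)))
```

The two descriptors differ as operators, yet every expectation value in the standard state agrees, so no measurement statistics could tell us which of them (or of the continuum obtained by varying `PHI`) is the true one.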

⁵A half-way house is unsatisfactory. One might adopt a conventional fixed initial state in the
conservative interpretation and adjust the q_i(0) accordingly, but this would not eliminate the global role
of the state in determining joint properties, i.e. we do not have reducibility to individual properties, as
in this interpretation the q_i(t) do not represent intrinsic properties.

⁶Note, however, that the ontological interpretation of Deutsch-Hayden lacks a measurement theory.
Although we have a prescription for what the probability distributions associated with various
measurements will be, we do not yet have a description of the measurement process itself, or of the obtaining
of various outcomes, in terms internal to the theory. It might be thought that some sort of Everettian
approach could be adopted, but as the relative state finds no place in the Deutsch-Hayden framework,
it appears, at least prima facie, to be resistant to standard Everettian analysis.

Thus the facts about the true descriptors, and hence about the intrinsic properties
that systems actually possess, although supposedly the fundamental reality, are
empirically inaccessible. According to the ontological interpretation, there is an important
fact about what the correct descriptors of a set of systems are, but any assignment of
descriptors to such a set will necessarily be underdetermined by the accessible data. 

As a corollary of this point, it is worth remarking that the analogy Deutsch and 
Hayden suggest between their descriptors and Einstein's desired 'real state' for sepa- 
rated systems might be overstated. While it may be the case that under the ontological 
interpretation, subsystems do indeed possess independent real states, we would still 
face the epistemological problem that this real state could never be determined by local
measurements: we could at most only ever learn the ⟨q_i(t)⟩_ρ for a system, when
presented with a sufficient number of identically prepared systems.

The second difficulty for the ontological interpretation, and one closely related to the
underdetermination problem, is that the shift in meaning of the q_i(t), from determining
time evolution for any given initial state, to denoting intrinsic properties of subsystems,
induces a worrisome redundancy. In the normal quantum mechanical picture one can
think of the q_i(t) in the following way.

Take some fixed sequence of unitary operations performed on a group of systems. 
This sequence will correspond to some particular evolution of the set of q_i(t). Now we
could consider different initial quantum states for the set of systems; these states would
evolve variously under the sequence of unitary operations whose effect is captured in
the evolving q_i(t). At any given time, the actual quantum state of our group of systems
could be one from a whole range, depending on which initial state was in fact chosen.
The evolution of some particular initial state from time 0 to time t may therefore be
said to depict one history from the range of possible ones. To use the term favoured by 
philosophers, the evolution of this state represents the history of one possible world. A 
choice of different initial state is a choice of different possible world. 

Now the q_i(t) capture the effects of our sequence of unitary operations for all initial
states. Thus their time evolution can be said to depict the histories of the entire set of
possible worlds; whilst the world from amongst these that is realised is determined by
which initial state is chosen. However, when we move to the ontological view, the very
same structure (the sequence of time evolving q_i(t)) only represents a single world, as the
choice of initial state is a fixed part of the formalism. What seems like it can represent 
a range of possible worlds, we are to suppose, can only represent a single one; and 
conversely, the structure being used to describe a single world in the ontological Deutsch- 
Hayden picture is one we know in fact to be adequate to describe a whole set of possible 
worlds in quantum mechanics. Thus the Deutsch-Hayden picture, taken ontologically, 
would seem to be extremely, perhaps implausibly, extravagant in the structure it uses 
to depict a single world. This difficulty, whilst certainly not a knock-down objection 
to the ontological interpretation, nonetheless serves to highlight some of its unpalatable
features. 

4.4 Information and Information Flow 

We have seen that under the conservative interpretation, the Deutsch-Hayden formalism 
does not confer any benefits with respect to locality that do not follow directly from 
adopting no-collapse, unitary, quantum mechanics as a basic theory, and hence would 
be equally available with an Everettian interpretation, or, if one were perhaps to allow 
a formal collapse, but deny that it corresponded to any real process, on a statistical 
interpretation. With the ontological interpretation, by contrast, we do find something 
new, but this is better characterized as concerning the reducibility of global properties 
to local intrinsic properties of subsystems, rather than being a question of locality or 
nonlocality. 

One of the most important aspects of the Deutsch-Hayden approach, however, is the 
claim that their formalism finally clarifies the nature of information flow in quantum 
systems; indeed, that it reveals that information can be seen to be transported locally 
in quantum systems, the phenomena of entanglement assisted communication notwith- 
standing. It is to this question that we now turn. Again, the matter must be assessed 
independently for the two different modes of interpretation of the formalism. We shall 
begin, however, with a few general remarks about the topic of information flow.

4.4.1 Whereabouts of information

As we saw in the previous chapter, the puzzle that seems to be posed by the examples of 
teleportation and the like is over the question 'How does the information get from A to
B?'. This is a perfectly legitimate question if it is understood as a question about what
the causal processes involved in the transmission of the information are, but recall that 
it would be a mistake to take it as a question concerning how information, construed as 
a particular, or as some pseudo-substance, travels. 'Information' is an abstract noun and 
doesn't serve to refer to an entity or substance. Thus when considering an information 
transmission process, one that involves entanglement or otherwise, we should not feel 
it incumbent upon ourselves to provide a story about how some thing, denoted by 'the 
information', travels from A to B; nor, a fortiori, worry about whether this supposed 
thing took a spatio-temporally continuous path or not. By contrast, we might very 
well be interested in the behaviour of the physical systems involved in the transmission 
process and which may or may not usefully be said to be information carriers during 
the process. 

A second general point concerns what it might mean to ask whether or not information
is a 'non-local quantity' (Deutsch and Hayden, 2000, p. 1759). Note that for
the reason just stated, information is not something that can be said to have a
spatio-temporal character, but nonetheless one can, in certain contexts, intelligibly ask 'Where
is the information?' This question is a fairly specialised one, though: it presupposes that 
we have some specific piece, or type, of information in mind and asks where this may be 
found, in the sense of asking where one might learn, or learn about, the fact, or facts, 
it pertains to. (And, of course, to specify where something may be learnt is not to say 
that what is learnt has to be located there.) Sometimes no very precise answer to this 
question in terms of a designated spatio-temporal region will be possible, or particularly 
helpful. 

As a particular example of the latter case, and one that will figure again later, 
consider the following scenario of encrypting a message. Let us say that Alice and Bob 
are spatially separated but share a secret random bit string, the key. Alice also has in her 
possession a message she wishes to send to Bob, a string of bits denoting something; this
is the information we are interested in. At this stage, we can say that Alice's notebook, in
which the message is written, contains the information. If she then encrypts the message
by adding (mod 2) the message string to the key, writes the result down (producing the
cyphertext), and destroys both the original message and her copy of the key, then the
question 'Where is the information now?' leaves us without a straightforward answer.
We can't answer by gesturing to Alice's side, or to Bob's side, or to the cyphertext, since
from none of these, taken individually, may we learn what the message was; although if 
we had access both to Bob's key and the cyphertext then we should be able to learn it. 
A simple request for a location doesn't have a useful answer in this scenario. For this 
reason, we introduce further vocabulary and talk instead of the message being encrypted 
in the cyphertext. It is not to be found wherever the cyphertext is located, rather, it 
may be learnt whenever cyphertext and key are brought together, and not otherwise; the
asymmetry in the roles of the cyphertext and key is captured by the fact that it is the 
cyphertext and not the key in which the message is said to be encrypted (although not 
located). The bald question 'where is the information throughout this protocol?' does 
not, in this case, invite answers with sufficient articulation for a perspicuous description 
of what is going on. 
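The encryption scenario can be made concrete with a few lines of code. This is a minimal sketch of the mod 2 addition described above; the particular message bits and the helper names `xor_bits` and `key_consistent_with` are invented for the example.

```python
import secrets


def xor_bits(a, b):
    """Add two equal-length bit strings mod 2, position by position."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return [x ^ y for x, y in zip(a, b)]


def key_consistent_with(cyphertext, candidate_message):
    """The key that would turn candidate_message into this cyphertext."""
    return xor_bits(cyphertext, candidate_message)


message = [1, 0, 1, 1, 0, 0, 1, 0]                # what Alice wishes to send
key = [secrets.randbits(1) for _ in message]      # shared secret random bits
cyphertext = xor_bits(message, key)               # what Alice writes down

# Cyphertext and key brought together yield the message again.
recovered = xor_bits(cyphertext, key)

# Taken alone, the cyphertext fits any message whatever: some key always exists.
other_message = [0, 0, 0, 0, 1, 1, 1, 1]
other_key = key_consistent_with(cyphertext, other_message)
```

Because some key exists for every candidate message, neither the cyphertext alone nor the key alone allows the message to be learnt, which is why a bare request for a location has no useful answer here.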

Deutsch and Hayden, however, have something specific in mind when they raise the 
question of whether in quantum systems, information is a local or non-local quantity. 
If it is the case that a joint quantum system can have global properties that are not 
reducible to local properties of subsystems, then these global properties might be used 
to encode and transmit information in a way that cannot be understood as subsystems 
individually carrying the information. This is what they would mean by information 
being a non-local quantity. The issue is whether we can, in general, always understand an 
information transmission process involving quantum systems in terms of the properties of 
subsystems being used to carry the information. The examples of entanglement assisted 
communication, as usually understood, would strongly suggest otherwise. 

We shall focus on teleportation as the most interesting case; and one which displays 
the characteristic features at issue.

Figure 4.4: Teleportation. All systems begin in the computational basis state. Bob (B)
creates a maximally entangled state of systems 4 and 5. System 1 is prepared in some
unknown state by a rotation depending on the parameter θ. When system 4 is sent
to Alice (A), she performs a measurement in the Bell basis, recording the outcome in
systems 2 and 3. Systems 2 and 3 are transported to Bob, who performs a controlled-σ_z
operation on 2 and 5, and a controlled-NOT on 3 and 5. System 5 is left in the original
unknown state |χ⟩.



4.4.2 Explaining information flow in teleportation: Locally accessible
and inaccessible information

Let us recall once more what the teleportation protocol looks like in the absence of 
collapse (Fig. 4.4). Sharing a maximally entangled state with Bob, Alice performs a
joint measurement on her half of the entangled pair (4) and on a system (1) prepared 
in some unknown state, with the result that the state of Bob's system (5), relative to 
the outcomes of her measurement, is changed in a way that relates systematically to the 
unknown state to be teleported. At this stage of the protocol, every system involved is 
now in a maximally mixed state, i.e., the information that characterises the unknown 
state will not be available to local measurements. As we have seen, the protocol continues 
with the sending of the systems (2 and 3) recording the outcome of Alice's measurement 
to Bob, who can now perform the conditional unitary operations required to disentangle 
his system (5) from the others, in such a way that it ends up in the original, unknown, 
state. The information characterising the unknown state is now available again to local 
measurements, but this time, only at Bob's location. 
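The relative-state structure of the protocol can be checked with a small calculation. The sketch below is a simplification under stated assumptions: it follows only systems 1, 4 and 5, leaves out the record systems 2 and 3, takes the unknown state to be real with amplitudes cos(θ/2) and sin(θ/2), and uses hypothetical names throughout.

```python
import math

S = 1 / math.sqrt(2)
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

# Bell basis for systems 1 and 4, stored as BELL[name][a][b] for |a b>.
# All amplitudes are real, so no complex conjugation is needed below.
BELL = {
    "Phi+": [[S, 0], [0, S]],
    "Phi-": [[S, 0], [0, -S]],
    "Psi+": [[0, S], [S, 0]],
    "Psi-": [[0, S], [-S, 0]],
}

# Operations Bob applies, in order, once the outcome record reaches him.
CORRECTIONS = {"Phi+": [], "Phi-": [Z], "Psi+": [X], "Psi-": [X, Z]}


def unknown_state(theta):
    """State of system 1, prepared by a rotation through theta."""
    return [math.cos(theta / 2), math.sin(theta / 2)]


def relative_states(chi):
    """Unnormalised state of system 5 relative to each Bell outcome on (1, 4)."""
    pair = BELL["Phi+"]  # shared entangled state of systems 4 and 5
    return {
        name: [
            sum(bell[a][b] * chi[a] * pair[b][c] for a in range(2) for b in range(2))
            for c in range(2)
        ]
        for name, bell in BELL.items()
    }


def apply(u, v):
    """Apply a 2x2 matrix to a two-component state."""
    return [u[0][0] * v[0] + u[0][1] * v[1], u[1][0] * v[0] + u[1][1] * v[1]]


def bob_reduced_state(rel):
    """Density matrix of system 5 before correction, summed over outcomes."""
    return [[sum(r[i] * r[j] for r in rel.values()) for j in range(2)] for i in range(2)]


def fidelity_after_correction(chi, name, rel_state):
    """Overlap between the unknown state and Bob's corrected relative state."""
    corrected = rel_state
    for u in CORRECTIONS[name]:
        corrected = apply(u, corrected)
    norm = sum(x * x for x in corrected)
    overlap = sum(x * y for x, y in zip(chi, corrected))
    return overlap ** 2 / norm
```

Before the record arrives, `bob_reduced_state` is half the identity whatever the value of θ, so nothing about θ is available to local measurements on system 5; once the correction matching the outcome is applied, the fidelity with the unknown state is one in every case.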

The crucial feature in this protocol is the change in the relative states that is allowed
by the global property of entanglement. Subsystems, therefore, do not seem to be playing
the role of information carriers in teleportation, and this conclusion is further supported 
by the fact that the only systems that are sent from Alice to Bob during the protocol 
are both maximally mixed. 

Deutsch and Hayden, though, wish to give an account of teleportation in which
information flow is local; that is, in which subsystems can indeed be seen to carry information
from Alice to Bob. In particular, they are concerned to rebut claims such as that of
Braunstein (1996), who suggests that the information characterising the unknown state
is contained in the global system rather than in subsystems during the protocol; or the
by now very familiar approach of Penrose (1998), who suggests that the information
must flow along a channel constituted by the initial shared entanglement between Alice
and Bob, first backwards, and then forwards again in time.

Clearly, a good starting point for the debate would be an appropriate criterion for 
when a system may be said to contain information. Deutsch and Hayden would seem to 
have one of two slightly different necessary and sufficient conditions in mind, although 
they are not explicit. 

They begin by introducing a fairly familiar sufficient condition for a system S to
contain information about a parameter θ: If a suitable measurement on S would display
a probabilistic dependence on θ, then S may be said to contain information about
θ. Then a necessary condition for containing information is presented: S can be said
to contain information about θ only if its descriptor depends on θ. These definitions
motivate an informal argument of roughly the following form: Let us say we have a group
of systems that includes S; denote this group by S ∪ S^c. Assume that the descriptor of
S alone depends on θ. If we know that the group S ∪ S^c as a whole contains information
about θ, because global measurements would display suitable probabilistic dependence,
but S^c does not (as the descriptors of the systems in S^c do not depend on θ), then the
information must be in S, in virtue of S's descriptor depending on θ. Therefore from
the fact that the descriptor of S depends on θ, we may infer that it contains information
about θ.

This conclusion would be underwritten by either one of the following two definitions⁷:

Definition 3 S contains information about θ ⇔ its descriptor depends on θ.

Definition 4 S contains information about θ ⇔ its descriptor depends on θ and
measurements on the global system S ∪ S^c would display a probabilistic dependence on θ.

These two definitions differ as it is possible for the q_i(t) to depend on θ, but for ρ(t)
not to (recall the problem of underdetermination). The second is rather more natural,
particularly if we are to tie the notion of information being used to the context of definite
communication-theoretic procedures.

With one of these definitions of containing information in hand, Deutsch and Hayden's
claim for the locality of information flow follows directly from the contiguity property
of the changes in the q_i(t). The proposal is that teleportation should now be
understood in the following way. System 1 is prepared in some state characterised by
the parameter θ; its descriptor now depends on θ. Following Alice's Bell-basis measurement,
the descriptors of the 'message qubits' 2 and 3 also come to depend on θ. These
two systems, as they are transported, carry the information about θ to Bob's location,
where, following a suitable local interaction, the descriptor of his system (5) also comes
to depend on θ. We must note the further, crucial, point, however, that the systems
2 and 3 carry the information to Bob in a locally inaccessible manner. Although their
descriptors depend on θ, and hence the systems may be said to carry information under
the Deutsch-Hayden definition, this dependence may not be revealed by measurements
on the systems individually: their reduced density matrices are maximally mixed.
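A two-qubit toy network shows how a descriptor can depend on θ while the system's own measurement statistics do not. The sketch is not the teleportation circuit itself and makes its own assumptions: the standard state is |00⟩, the network (a Hadamard gate on qubit 1, a controlled-NOT, then a rotation of qubit 1 about z through θ) is chosen for illustration, and Heisenberg-picture operators are computed as U†AU with hypothetical helper names.

```python
import cmath
import math

S = 1 / math.sqrt(2)
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
HADAMARD = [[S, S], [S, -S]]
# Controlled-NOT with qubit 1 as control; index 2*a + b for the basis state |a b>.
CNOT = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]


def matmul(a, b):
    """Product of two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def kron(a, b):
    """Tensor product of two 2x2 matrices (a on qubit 1, b on qubit 2)."""
    return [
        [a[i][j] * b[k][l] for j in range(2) for l in range(2)]
        for i in range(2)
        for k in range(2)
    ]


def rz(theta):
    """Rotation of a single qubit about z through theta."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]


def network(theta):
    """Hadamard on qubit 1, then CNOT, then rz(theta) on qubit 1."""
    return matmul(kron(rz(theta), I2), matmul(CNOT, kron(HADAMARD, I2)))


def heisenberg(op, u):
    """Heisenberg-picture operator U† op U."""
    return matmul(dagger(u), matmul(op, u))


def descriptor_qubit_1(theta):
    """Triple of evolved Pauli operators for qubit 1."""
    u = network(theta)
    return [heisenberg(kron(s, I2), u) for s in (X, Y, Z)]


def expectation(op):
    """<00| op |00> for the fixed standard state."""
    return op[0][0]


def joint_xx(theta):
    """Expectation of the product of the x components of both descriptors."""
    return expectation(heisenberg(kron(X, X), network(theta)))
```

Every component of qubit 1's descriptor has zero expectation for every θ (its reduced state is maximally mixed), although the descriptor itself changes with θ; the dependence shows up only in joint quantities such as `joint_xx`, which equals cos θ for this network.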

Deutsch and Hayden define locally inaccessible information as information that is
present in a system, but that may not be revealed by individual measurements on the
system. The explanation of teleportation, then, is that the message qubits do actually
carry the information characterising the unknown state to Bob, but they do so locally
inaccessibly. The general conclusion is that subsystems can always be thought to carry
information in entanglement assisted communication protocols (hence 'information is a
local quantity'); it is just that these protocols involve locally inaccessible information.

⁷The two statements that follow must be understood as proposed definitions, as they are not entailed
by Deutsch and Hayden's argument, just sketched. The argument uses the necessary and the sufficient
condition for containing information, and the rule of inference: if a group of systems contains information
about θ, and a subgroup does not, then the complement of that subgroup contains the information about
θ. However if we have more than one system whose descriptor depends on θ, then all that the argument
based on these principles allows us to conclude is that their union contains the information, not that each
system individually does, which is the desired conclusion.

4.4.3 Assessing the claims for information flow 

How satisfactory is this account as an explanation of teleportation, and, indeed as a gen- 
eral picture for information transmission in quantum systems? We shall consider three 
questions: First, have Deutsch and Hayden finally given the correct account of telepor- 
tation, as opposed, say, to Braunstein? Second, is the concept of locally inaccessible 
information useful? Third, do Deutsch and Hayden provide us with a new concept of 
information, or quantum information? We must consider the answers to these questions 
for the two modes of interpretation of the formalism in turn. 

Before that, a preliminary remark. Recall that as properly understood, the question 
'How does information get from Alice to Bob?' is a question about the causal processes 
involved in the transmission. It is clear that simply answering: 'the information is 
carried in the message qubits' would not be enough to explain teleportation on its own, 
as this information might never be made accessible again at Bob's location, or it might 
be made locally accessible, perhaps, but not in such a way that Bob's system is actually 
to be found in the original unknown state. Obviously, the explanation has also to refer 
to the role of the initial entanglement and the changes in the global properties of the 
system that this entanglement allows, and which the teleportation protocol exploits. 
This suggests a moderate way of understanding the application of the Deutsch-Hayden
formalism in teleportation that would not involve commitment to their claims about 
locality or information flow. 

On this view, the advantage their formalism presents is simply in highlighting the
difference in roles played by the initial entanglement and the message qubits in
teleportation. The asymmetry in these roles is, as Deutsch and Hayden point out, analogous
to the asymmetry in the roles of the key and cyphertext in classical encryption based on
a shared secret random string*. Before the final stage of the protocol, it is the message
qubits, and not Bob's qubit, that have had the direct dynamical coupling to the system
whose state is to be teleported (reflected in the fact that their descriptors depend on
$\theta$); compare with the classical cyphertext, which is generated from the message. But
it is the correlations that are established between the relative states of the message
systems and Bob's qubit, in virtue of the initial entanglement, that allow the unknown
state to be recovered by Bob. (Similarly, the classical correlations between the key and
cyphertext allow the encrypted message to be recovered.) This suggests that it may
well be useful to distinguish between the question of whether an analysis in terms of the
$q_i(t)$ helps us understand an aspect of teleportation; and whether the account in terms
of information flow does so.

* The analogies and, importantly, disanalogies, between entanglement and shared secret bits are
developed in detail in Collins and Popescu (2002).

We return now to our three questions. The adjective 'correct' in the first question might
be understood in one of two ways; either correct simpliciter, or correct given the
background assumptions. In order to be correct simpliciter, the account of teleportation
would clearly have to be, first of all, correct given the background assumptions, while
these background assumptions themselves also have to be correct. The relevant
background assumption when we consider the conservative interpretation is that unitary
(no-collapse) quantum mechanics is our setting; this is the setting also for Braunstein
(1993), hence the point of the comparison.

Conservative interpretation

From the previous remarks on the conservative interpretation, we know that the
assignment of properties to systems involves both the global state and the $q_i(t)$: we do not
have reducibility of global properties to properties of subsystems and therefore
subsystems cannot, after all, always be thought to carry information in entanglement assisted
communication. It makes no odds whether one adopts the Heisenberg or the Schrödinger
viewpoint, it is still the case that joint (and irreducible) properties of subsystems are
being used to carry information in the protocols. In Braunstein's account of
teleportation, after Alice's Bell-basis measurement, the information characterising the unknown
state is said to be in the correlations between the message qubits and Bob's qubit,
i.e., it is carried by certain joint properties of these systems. The same is true in the
Deutsch-Hayden setting, understood conservatively; so we are not in fact being offered
a substantially different account of teleportation. This entails part of the answer to the
second question.

Under the conservative interpretation, there is an important sense in which there is 
no difference between saying that a system contains locally inaccessible information and 
saying that the information is in the correlations. In both cases this would translate into: 
the information is carried by joint, and not individual, properties of subsystems. One 
can frequently make perfectly good sense of a system being said to contain information 
about a parameter if a suitable measurement on the system would display a probabilistic 
dependence on the parameter, for then one can learn something about the parameter 
by performing the measurement. But if the information is locally inaccessible, then this
means either i) that for some different initial state of the global system there would be
a probabilistic dependence for the local measurement, but this would be physically
irrelevant to the situation actually being considered; or ii) that for some measurement on the
global system, a probabilistic dependence on the parameter will be displayed, and this
is no different from what one would say on Braunstein's account.

So where, if anywhere, does a difference lie? In marking an asymmetry. But note
that the pertinent asymmetry may also be understood in a Schrödinger picture account
such as Braunstein's. In teleportation, the point being emphasized is that it is the
message qubits, and not Bob's qubit, that have had the direct dynamical coupling to
the system that was prepared in the state characterized by the parameter $\theta$; and this
is clear enough without invoking locally inaccessible information. (The significance, of
course, is that we know from the no-signalling theorem that dependence on a parameter
chosen in one region may not be displayed in another unless there has been a direct,
or indirect, dynamical coupling between systems from the two regions.) Another way
to mark the asymmetry would begin by pointing out that the initial entanglement, the
sending of the message qubits to Bob, and the correct sequence of unitary operations
being performed by Alice and Bob, are individually necessary, and jointly sufficient,
conditions for a successful teleportation protocol. If we were to miss any one of these
out, then the protocol would fail, but evidently, for different reasons in each case.

The preceding discussion indicates that under the conservative interpretation, the 
concept of locally inaccessible information is not playing a very useful explanatory role. 
It is misleading to suggest that the message qubits really carry anything; at best this is a
roundabout way of saying that joint properties do*. This conclusion in turn casts doubt
on the value of adopting either of the proposed definitions of containing information in
the context of the conservative interpretation.

However, it would be precipitate to conclude from this that we may in fact learn 
nothing from the analysis of teleportation in the Deutsch-Hayden formalism. As suggested
earlier, one can distinguish between the description using the $q_i(t)$ being useful
and the concept of locally inaccessible information being so. Deutsch and Hayden are 
certainly right that an analysis in terms of their descriptors does help emphasize the 
important asymmetry between the roles in the protocol of sending the message qubits 
and the existence of the initial entanglement; and due consideration of this asymmetry 
contributes, for example, towards undermining the plausibility of a Penrose-type expla- 
nation. The analogy with the cyphertext and key is also enlightening in this regard. 
But as we have just noted, it is quite possible to mark this asymmetry without needing 
to invoke talk of containing information, which has potential to mislead. 
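The classical side of the cyphertext and key analogy can be made concrete with a one-bit one-time pad. The sketch below is illustrative only, and its function names are assumptions rather than anything taken from Deutsch and Hayden. For a uniformly random key it tabulates the distribution of the cyphertext alone, which comes out the same for either message, and it checks that key and cyphertext jointly fix the message.

```python
from collections import Counter
from itertools import product


def encrypt(message_bit, key_bit):
    """One-time pad on a single bit: cyphertext = message XOR key."""
    return message_bit ^ key_bit


def decrypt(cypher_bit, key_bit):
    """Recover the message from the cyphertext together with the key."""
    return cypher_bit ^ key_bit


def cyphertext_distribution(message_bit):
    """Distribution of the cyphertext over a uniformly random one-bit key."""
    counts = Counter(encrypt(message_bit, key) for key in (0, 1))
    return {c: counts[c] / 2 for c in (0, 1)}


# The cyphertext alone: one distribution per possible message.
distributions = {m: cyphertext_distribution(m) for m in (0, 1)}

# Key and cyphertext together: the message is recovered in every case.
recovered = {(m, k): decrypt(encrypt(m, k), k)
             for m, k in product((0, 1), repeat=2)}
```

The asymmetry noted in the text shows up here too: the cyphertext is generated from the message and the key is not, yet neither displays any dependence on the message when examined on its own.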

The answer to the third question under the conservative interpretation is perhaps 
the most intriguing. We have seen that locally inaccessible information does not figure 
successfully in an attempt to retain subsystems as information carriers in the presence 
of entanglement, but have Deutsch and Hayden nonetheless succeeded in shedding light 
on the — sometimes obscure seeming — concept of quantum information? They say, for 
example: 

...it is impossible to characterize quantum information at a given instant
using the state vector alone. To investigate where information is located,
one must also take into account how the state came about. In the Heisenberg
picture this is taken care of automatically, precisely because the
Heisenberg picture gives a description that is both complete and local.
(Deutsch and Hayden, 2000, p. 1773)

* Recall from the comments in Section 4.4.1 and the previous chapter that we are not forced to say
that the information must be located in one system rather than another, or that it is carried by one 
system rather than another. The assumption that we must is predicated upon the misleading picture 
of information as a particular or substance. 
It seems, though, that this suggestion would incorporate a number of confusions.

While it is true that the $q_i(t)$ provide more information than simply following the
time evolved state would, this is not information about the time evolution of particular
systems that the latter description lacks. The $q_i(t)$ look more informative because
they capture time evolution for any given initial state, thus they say more about the
dynamics a system has been subject to; but in the conservative interpretation, this is
not to say more about the system, but rather about the unitary operators. This extra
information that one gets is not then 'complete', i.e., it is not information that would be
lacking in the description of a given network of systems in the Schrödinger picture but
is supplied in the Heisenberg picture. Instead, it is information about something else; about how other
systems, prepared in a different way, would react, or information about, for example, the
fields that have driven the systems' evolution.

Furthermore, one can readily accept that one has more information if one knows how
the state came about, but deny that this information is a property that has to be located.
So again, one can, in fact should, deny that there is information located with systems
that is lacking from the state vector picture. The 'extra' information represented in
the $q_i(t)$ consists of facts about the unitary operations undergone; and this information
cannot be said to be here, there, or anywhere, as it makes no sense to ask where these
facts are. Facts are of the wrong logical category to possess a location (cf. Strawson
(1950)). The underlying thought seems to be that the description in terms of the $q_i(t)$
allows us to 'determine where the information about a given parameter is located at a
given instant' (Deutsch and Hayden, 2000, p. 1771). But note that the question 'Where
is the dependence on the parameter?' could be a bad question; one inviting us to confuse
the description of a thing with the thing itself. It is what depends on the parameter
that is important; and in entanglement assisted communication, under the conservative
interpretation, this will often only be joint, and not individual, properties.



Ontological interpretation 

The discussion of our three questions for the ontological interpretation may be somewhat
more brief. As to the first: on the ontological interpretation, global properties are
reduced to intrinsic properties of subsystems, therefore, the properties of subsystems
may indeed be thought to be carrying the information in entanglement assisted
communication protocols. Thus, adopting the Deutsch-Hayden formalism understood in the
ontological way, we would have an explanation of teleportation in which the information
that the system carries as a whole can be thought a consequence of information being
carried by subsystems; in which information is genuinely carried between Alice and Bob
in the message qubits during teleportation. (Of course, this explanation may not be
reflected back onto our more usual ways of understanding quantum mechanics, but relies
on the ontological interpretation. As such it has no power to confute opposing views,
such as Braunstein's, that derive from a different set of assumptions.)

Why does it now seem acceptable to say that information is carried in subsystems,
despite the fact that it may not be possible to learn anything by performing measurements
on an individual system? Because in the ontological interpretation, the explanation
of the physical processes by which information is transmitted from A to B (answering
'How does the information get from A to B?' in the legitimate way) involves the intrinsic
properties of subsystems denoted by the $q_i(t)$. In contrast to the conservative
interpretation, we are now able to answer the question 'What depends on the parameter?' with:
the intrinsic properties of subsystems. As the intrinsic properties of subsystems are
being used as the information bearing properties under the ontological interpretation, the
definitions given above of containing information would have a point*.

Regarding the usefulness of the concept of locally inaccessible information, the pur- 
pose of the introduction of this category is to recognise that there are two ways in which 
a system may be said to carry information in the ontological interpretation; either in its 
observable, or in its unobservable, empirically inaccessible, properties. This distinction is 
necessary for the explanation of entanglement assisted communication in the ontological
interpretation, thus the introduction of the category is useful.

* Although it is not clear that they are wholly trouble-free. Under the first definition (Definition 3), for example, there
will be cases in which a system is said to contain information locally inaccessibly, but where it could
never be made accessible, i.e. could never be displayed even under global measurements. This would
tend to undermine the plausibility of the claim that the system does in fact contain information, which
casts doubt on the acceptability of the definition. So again, the second definition (Definition 4) would seem preferable. But
it might be beneficial to restrict talk of containing information still further, to cases in which some
particular information transmission protocol is envisaged, or in which an agent would stand to learn
something by performing measurements on a group of systems.
In answer to our third question, however, it is important to recognise that the ontological
interpretation of Deutsch-Hayden is not providing us with an account of a
new type of information, but of new properties, new ways in which information may 
be carried. Again, because this turns on the details of the ontological interpretation, 
it cannot be taken to provide us with a new understanding of information, or quan- 
tum information, that could be transferred back to more familiar quantum mechanical 
settings. 

4.5 Conclusion 

Deutsch and Hayden present their formalism as an avowedly local account of quantum 
mechanics, which finally clarifies the nature of information transmission in entangled 
quantum systems. To what extent is this successful? We have seen that in order to 
assess the claims of locality, and the claims regarding the nature of information flow,
it is essential to distinguish between a conservative and an ontological interpretation of 
the formalism, as very different conclusions follow. To summarise: 

On the conservative interpretation, there are no benefits with respect to locality 
that do not follow immediately from adopting a version of quantum mechanics in which 
there is no genuine process of collapse and no additional properties added (and which, 
consequently, would be shared by an Everettian or a statistical interpretation); thus no 
distinctive feature of the Deutsch-Hayden approach is in play. As far as information 
transmission is concerned, the formalism does not show that information is after all, 
a local quantity (in Deutsch and Hayden's sense), as it remains the case that joint, 
rather than individual, properties are used to carry information in entanglement assisted 
communication protocols. The explanation proffered of teleportation does not differ
in substance from that which would be given by an account sharing the same initial
assumptions, such as that of Braunstein. Furthermore, we have seen that it would
be confused to think that the description in terms of the $q_i(t)$ fills in an account of
information, and where it is located in quantum systems, that is missing in the usual
Schrödinger picture. The additional information the $q_i(t)$ provide (when they do so)
consists of certain facts about the unitary operations undergone (not information carried
by systems); and it makes no sense to propose that these facts have a location.

With the ontological interpretation, on the other hand, we have an interesting re- 
sult; although one better characterized as regarding the reducibility of global properties 
of quantum systems to individual properties, rather than as a question of locality or 
nonlocality. With this reducibility, the claim about the locality of information trans- 
mission, even in the presence of entanglement, follows. However, as the ontological 
interpretation provides a picture which differs so markedly from our usual ways of un- 
derstanding quantum mechanics, these results clearly cannot be taken to shed light on 
the nature of information flow in entangled quantum systems when we have not taken 
the dramatic step of introducing an entirely new range of intrinsic properties of sys- 
tems. And reducibility does not come free: one is confronted with an unpleasant form 
of underdetermination and the bogey of redundancy. 

Unfortunately, Deutsch and Hayden do not distinguish the two different modes of
interpretation of their formalism; indeed they are arguably conflated, to deleterious 
effect. The reason to believe that they must have something along the lines of the 
ontological interpretation in mind is that their main claims would not be true in any 
interesting way otherwise; but at certain points they would seem to suggest clearly that 
the conservative reading is correct: when they imply that it is merely the move to the 
Heisenberg picture which does the work (p. 1759); when suggesting that they have simply 
provided a reformulation of Schrödinger picture quantum mechanics (p. 1773). As we
have seen, however, if there is equivocation between the conservative and the ontological 
interpretation, then it is impossible to draw any conclusion regarding information flow 
and locality. 

So, having drawn this all-important distinction, the conclusion of our discussion is 
that in the ontological interpretation, we have a bold thesis which might be adopted, 
despite its objectionable features, in order to obtain reducibility of global properties to 
local properties, if this was thought particularly desirable for some reason. Retaining 
the conservative approach, on the other hand, we would have a formalism with some 
occasionally useful features, but not one which provides a novel sense of locality, nor,
indeed, of information flow. En route, the discussion should have shed some more light
on the puzzles that so often seem to surround the question of information transmission 
in entangled quantum systems. 



Chapter 5 

Characterizing Entanglement in the Deutsch-Hayden Formalism



In the previous chapter, frequent reference was made to the fact that in general in quan- 
tum mechanics, the properties assigned to joint systems are not reducible to properties 
possessed by individual systems: this is the result of entanglement. (The ontological ap- 
proach to Deutsch-Hayden is then seen as a theory with new types of intrinsic properties 
of subsystems which allow such reduction, even in the presence of entanglement.) 

A natural question to consider next, therefore, is how one may characterize entan- 
glement within the Deutsch-Hayden formalism. Based as it is on the Hilbert-Schmidt 
representation, one may also hope to gain some geometric insight into entanglement,
given the pleasing geometrical picture of quantum states that the formalism provides. 

In this chapter we will be concerned with bipartite entanglement, pure and mixed,
in the Deutsch-Hayden formalism. (Related investigations of the entanglement of $2 \otimes 2$
systems are Horodecki and Horodecki (1996); Kummer (1999, 2001).) As we have seen,
one of the primary benefits of the formalism is in tracking the history of the evolution of
systems. However, for the bare question of whether or not certain systems are entangled,
the details of the dynamical history are irrelevant: we need only consider whether the
density matrix is entangled or not. The relevant form of an n-system density matrix
will therefore be (cf. eqn. (4.9), Chapter 2):

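$$\rho(t) = \frac{1}{2^n} \sum_{m_1 m_2 \ldots m_n} \left\langle \prod_{k=1}^{n} q_{k,m_k}(t) \right\rangle_{\rho(0)} \bigotimes_{k=1}^{n} \sigma_{m_k} \tag{5.1}$$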

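For the case of $2 \otimes 2$ systems (now dropping the explicit time parameter) the density
matrix may be written in the convenient form:

$$\rho = \frac{1}{4}\Big( \mathbf{1} \otimes \mathbf{1} + \mathbf{a}\cdot\boldsymbol{\sigma} \otimes \mathbf{1} + \mathbf{1} \otimes \mathbf{b}\cdot\boldsymbol{\sigma} + \sum_{ij} c_{ij}\, \sigma_i \otimes \sigma_j \Big), \tag{5.2}$$

where $a_i$, $b_i$ and $c_{ij}$ ($i, j = 1 \ldots 3$) are the expectation values of the operators $\sigma_i \otimes \mathbf{1}$,
$\mathbf{1} \otimes \sigma_i$ and $\sigma_i \otimes \sigma_j$, respectively, i.e., the values $\langle q_{1,i} \rangle_\rho$, $\langle q_{2,i} \rangle_\rho$, $\langle q_{1,i}\, q_{2,j} \rangle_\rho$ in the previous
notation.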

In this form, the 3-vectors a, b are the Bloch vectors for the reduced density matrices
of systems 1 and 2 respectively, hence determine the expectation values for individual
experiments; while the 3 by 3 matrix with components $c_{ij}$, which we will term the
correlation matrix, determines the results of joint experiments. It is helpful to bear
in mind that $\rho$ in eqn. (5.2) is itself a unit vector (in the 16 dimensional real space of
Hermitian operators on $\mathbb{C}^2 \otimes \mathbb{C}^2$) which may be expressed in column form as:

$$\rho = \frac{1}{2} \begin{pmatrix} 1 \\ \mathbf{a} \\ \mathbf{b} \\ c_{ij} \end{pmatrix}$$


Let us review some terminology. A state is called entangled if it is not separable,
that is, for bipartite systems, if it cannot be written in the form:

$$|\Psi\rangle_{12} = |\phi\rangle_1 |\psi\rangle_2 \ \text{ for pure, or } \ \rho_{12} = \sum_i \lambda_i\, \rho^1_i \otimes \rho^2_i \ \text{ for mixed states,}$$

where $\lambda_i > 0$, $\sum_i \lambda_i = 1$ and 1, 2 label the two distinct subsystems. The case of pure
states of bipartite systems is made particularly simple by the existence of the Schmidt
decomposition:

$$|\Psi\rangle_{12} = \sum_i \sqrt{p_i}\, |\phi_i\rangle_1 |\psi_i\rangle_2, \tag{5.3}$$

where $\{|\phi_i\rangle\}$, $\{|\psi_i\rangle\}$ are orthonormal bases for systems 1 and 2 respectively, and $p_i$
are the (non-zero) eigenvalues of the reduced density matrices of the subsystems. The
number of coefficients in any decomposition of the form (5.3) is fixed for a given state
$|\Psi\rangle_{12}$, hence if the state is separable (unentangled), there is only one term in the Schmidt
decomposition, and conversely. Pure state entanglement also has a simple relation to
Bell inequality violation. Entanglement is a necessary condition for Bell inequality
violation; and for bipartite and n-partite pure state entanglement, it is sufficient too
(Gisin and Peres, 1992; Popescu and Rohrlich, 1992). That is, all pure entangled states
violate some Bell inequality.
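The Schmidt test for pure states can be sketched numerically. The snippet below assumes the standard route via the singular value decomposition of the state's coefficient matrix; the function names are illustrative. The squared singular values are the non-zero eigenvalues of the reduced density matrices, and a single non-zero value signals a product state.

```python
import numpy as np


def schmidt_probabilities(state, dims=(2, 2), tol=1e-12):
    """Squared singular values of the coefficient matrix: the non-zero p_i."""
    coefficients = np.asarray(state, dtype=complex).reshape(dims)
    singular_values = np.linalg.svd(coefficients, compute_uv=False)
    return singular_values[singular_values > tol] ** 2


def is_product_state(state, dims=(2, 2)):
    """A pure bipartite state is separable iff it has one Schmidt term."""
    return len(schmidt_probabilities(state, dims)) == 1


plus = np.array([1, 1]) / np.sqrt(2)
product_state = np.kron(np.array([1, 0]), plus)     # |0>|+>
bell_state = np.array([1, 0, 0, 1]) / np.sqrt(2)    # (|00> + |11>)/sqrt(2)
```

Here `schmidt_probabilities(bell_state)` has two equal entries, while `product_state` has a single one.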

Things are more complex for mixed state entanglement. The simple Schmidt decomposition
test does not exist; and it was shown by Werner (1989) that some mixed
entangled states do not violate any Bell inequality: remarkably, entanglement is not
sufficient for Bell inequality violation*. The provision of a simple necessary and sufficient
condition (the positive partial transpose condition) for mixed state entanglement
in $2 \otimes 2$ and $2 \otimes 3$ dimensional systems was finally achieved by the Horodeckis in
(Horodecki et al., 1996a). One of our aims in this chapter will be to gain some
understanding of the positive partial transpose condition in the Deutsch-Hayden formalism.
Throughout we will move freely from positive to contrapositive forms of statement, so
if a property P is a necessary condition for separability, say, then $\neg P$ is a sufficient
condition for entanglement; and so on.



5.1 Background 

It will be useful to review some of the pertinent results, beginning with a summary of 
the Horodeckis' positive partial transpose condition.

* Although it was later shown by Popescu (1995) that Werner states for dimensions greater than or
equal to five could be made to violate a Bell inequality if sequential measurements are allowed; this is
termed hidden nonlocality. See Barrett (2002) for further discussion.
5.1.1 Entanglement witnesses and the Horodeckis' PPT condition

We will begin with the concept of a positive map. A linear map is said to be positive if
it maps positive operators to positive operators. So if A is a positive operator*, $A \geq 0$,
and $\Lambda$ is a positive map, then $\Lambda(A) \geq 0$. A positive map $\Lambda$ is termed completely positive
if, when we consider adding a further system and so enlarging the Hilbert space, the
extended map $\mathbf{1} \otimes \Lambda$ is a positive map for operators on the larger space. (Completely
positive maps correspond to the most general form of quantum dynamics, see, e.g.,
Nielsen and Chuang (2000, Chpt. 8).)

The transpose operator T is an example of a map which is positive, but not completely
positive. The effect of T on an operator is defined in terms of its effect on the
operator when written in a particular basis: $(TA)_{ij} = A^T_{ij} = A_{ji}$, i.e. it is just the
familiar process of matrix transposition. The partial transpose is when T is applied to a
subsystem only. So consider an operator A on a tensor product Hilbert space $\mathcal{H}^1 \otimes \mathcal{H}^2$,
expressed in terms of a product basis $\{|i\rangle_1 |k\rangle_2\}$. Its matrix components will be $A_{ik,jl}$,
where the indices i, j refer to the first subsystem and k, l to the second. The effect of
taking the partial transpose on the second system, $(\mathbf{1} \otimes T)A$, will be:

$$\big((\mathbf{1} \otimes T)A\big)_{ik,jl} = A_{il,jk}.$$

Thus the indices referring to system 2 are permuted.
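The index permutation that defines the partial transpose is easy to realise numerically. The following is a minimal sketch (the function name is an illustrative choice): the matrix is reshaped so that its four indices i, k, j, l are explicit, the two indices belonging to system 2 are swapped, and the result is flattened again. Applied to the projector onto a maximally entangled two-qubit state, it yields an operator with a negative eigenvalue.

```python
import numpy as np


def partial_transpose_second(operator, d1=2, d2=2):
    """Swap the two indices that refer to system 2: A[ik, jl] -> A[il, jk]."""
    tensor = np.asarray(operator).reshape(d1, d2, d1, d2)   # axes: i, k, j, l
    return tensor.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)


bell = np.array([1, 0, 0, 1]) / np.sqrt(2)      # (|00> + |11>)/sqrt(2)
rho_bell = np.outer(bell, bell)
rho_pt = partial_transpose_second(rho_bell)
eigenvalues = np.linalg.eigvalsh(rho_pt)        # returned in ascending order
```

The negative eigenvalue shows concretely that the transpose, although positive, is not completely positive.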



* By a positive operator I mean what is sometimes called more precisely a positive semi-definite
operator. An operator A acting on a Hilbert space $\mathcal{H}$ is positive semi-definite if $\forall |\psi\rangle \in \mathcal{H}$, $\langle\psi|A|\psi\rangle \geq 0$.
If the operator A acts on a complex vector space, as in quantum mechanics, then positivity of A implies
that it is Hermitian.

Peres (1996) noticed that the partial transpose could be used to provide a necessary
condition for separability. Since T is a positive map and doesn't affect the trace of an
operator, its effect, $T\rho$, on a density operator will be to produce another valid density
operator. Thus the effect of the partial transpose on a separable density operator is

$$(\mathbf{1} \otimes T)\rho_{sep} = (\mathbf{1} \otimes T) \sum_i \lambda_i\, \rho^1_i \otimes \rho^2_i = \sum_i \lambda_i\, \rho^1_i \otimes (\rho^2_i)^T, \tag{5.4}$$

hence if a state is separable, it will remain positive under partial transposition. As the
transpose is not completely positive, however, there exist states which do not have a
positive partial transpose; and these must therefore be entangled. The positive partial
transpose condition proved better than Bell inequality and entropic (vide infra) criteria
for distinguishing the entanglement of Werner states, thus it was natural to conjecture
that having a positive partial transpose (PPT) is also a sufficient condition for
separability.

This conjecture was proven to hold for $2 \otimes 2$ and $2 \otimes 3$ dimensional systems by
the Horodeckis; but also to fail in higher dimensions, where PPT is only a necessary
condition.
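As an illustration of the PPT test in the $2 \otimes 2$ case, the sketch below applies it to a one-parameter family of two-qubit Werner states. The parametrisation used, a mixture of the singlet with weight p and the maximally mixed state, is an assumption made for this example and is not given in the text; the function names are likewise illustrative. For this family the smallest eigenvalue of the partial transpose is (1 - 3p)/4, so the test should report entanglement exactly when p exceeds 1/3.

```python
import numpy as np


def partial_transpose_second(rho):
    """Partial transpose on the second qubit of a 4x4 matrix."""
    return rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)


def werner_state(p):
    """p |psi-><psi-| + (1 - p) 1/4, with |psi-> the singlet (assumed form)."""
    singlet = np.array([0, 1, -1, 0]) / np.sqrt(2)
    return p * np.outer(singlet, singlet) + (1 - p) * np.eye(4) / 4


def has_positive_partial_transpose(rho, tol=1e-12):
    """True if no eigenvalue of the partial transpose is negative."""
    return np.linalg.eigvalsh(partial_transpose_second(rho)).min() >= -tol


ppt_by_p = {p: has_positive_partial_transpose(werner_state(p))
            for p in (0.0, 0.2, 0.3, 0.4, 0.8, 1.0)}
```

Since PPT is both necessary and sufficient for separability in this dimension, the entries of `ppt_by_p` settle separability for each listed value of p.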



Horodecki et al. (1996a) begin by introducing the concept of an entanglement witness*.
An entanglement witness is an Hermitian operator W which has a positive
expectation value for all separable states ($\forall \rho\,(\rho \text{ is separable} \rightarrow \mathrm{Tr}(W\rho) \geq 0)$), but negative
expectation value for at least one entangled state. Thinking in terms of the vector
space of Hermitian operators, $\mathrm{Tr}(W\rho) = 0$ defines a plane (normal to the vector W and
containing the origin) on one side of which lie all the separable states, $\mathrm{Tr}(W\rho) \geq 0$,
while on the other lies at least one entangled state.
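A concrete witness may help fix ideas. The sketch below uses the two-qubit swap operator, a standard example that is not taken from the text: its expectation value on a product state $|a\rangle|b\rangle$ is $|\langle a|b\rangle|^2 \geq 0$, and hence by linearity it is non-negative on every separable state, whereas on the singlet it equals -1. The sampling of random product states is only a spot check, not a proof, and the helper names are illustrative.

```python
import numpy as np

# Swap operator on two qubits: SWAP |a>|b> = |b>|a>.
SWAP = np.array([[1, 0, 0, 0],
                 [0, 0, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=complex)


def random_pure_qubit(rng):
    """A random normalised single-qubit state vector."""
    v = rng.normal(size=2) + 1j * rng.normal(size=2)
    return v / np.linalg.norm(v)


def witness_value(witness, rho):
    """Expectation value Tr(W rho)."""
    return np.trace(witness @ rho).real


rng = np.random.default_rng(0)
product_values = []
for _ in range(200):
    psi = np.kron(random_pure_qubit(rng), random_pure_qubit(rng))
    product_values.append(witness_value(SWAP, np.outer(psi, psi.conj())))

singlet = np.array([0, 1, -1, 0], dtype=complex) / np.sqrt(2)
singlet_value = witness_value(SWAP, np.outer(singlet, singlet.conj()))
```

In the geometrical picture, the sampled product states all lie on the non-negative side of the plane defined by the swap operator, and the singlet lies on the other side.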

* The name, though, is due to Terhal, e.g. Terhal (2000).
Now, it is clearly true by definition that

Proposition 1

$$\rho \text{ is separable} \rightarrow \forall A \Big( \forall \rho' \big(\rho' \text{ is separable} \rightarrow \mathrm{Tr}(A\rho') \geq 0\big) \rightarrow \mathrm{Tr}(A\rho) \geq 0 \Big),$$

where A ranges over Hermitian operators and $\rho'$ over density matrices. The Horodeckis
point out that it is a consequence of the Hahn-Banach theorem that the converse holds
too (Horodecki et al., 1996a). The set S of separable states is a convex set, an entangled
state $\rho$ lies beyond this set; and for any entangled state there exists a plane separating
it from S. Thus for any entangled state there is an entanglement witness, namely, the
operator W defining this plane (Horodecki et al., 1996a, Lemma 1).

The contrapositive statement is that if for a state $\rho$, $\mathrm{Tr}(A\rho) \geq 0$ for all A which have
positive expectation values on all separable states, then $\rho$ is separable (because if it
weren't, there would be some A, the entanglement witness, for which $\mathrm{Tr}(A\rho)$ would
be less than 0). We thus have
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Proposition 2 



p is separable ^ VA(^Vp'(p'is separable -> Tr(Ap') > O)^ Tr(A/9) > 



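Noting that $\forall\rho\,(\rho \text{ is separable} \rightarrow \mathrm{Tr}(A\rho) \geq 0)$ is logically equivalent to

$$\forall P\,\forall Q\,\big(\mathrm{Tr}(A\,P \otimes Q) \geq 0\big),$$

where $P$ and $Q$ are projectors on $\mathcal{H}^1$ and $\mathcal{H}^2$ respectively^4, we reach

Theorem 1 (Horodeckis, 1996)

$$\rho \text{ is separable} \;\leftrightarrow\; \forall A\,\Big(\forall P\,\forall Q\,\big(\mathrm{Tr}(A\,P \otimes Q) \geq 0\big) \rightarrow \mathrm{Tr}(A\rho) \geq 0\Big)$$
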

Again, the forward implication is true trivially; it is the converse, based on the lemma
concerning entanglement witnesses, that is profound. The next step is to restate
Theorem 1 in terms of positive maps^5.

Here use is made of the isomorphism between positive maps and operators on $\mathcal{H}^1 \otimes \mathcal{H}^2$.
Horodecki et al (1996a) note that a map $\Lambda$ will be positive iff the associated operator
$S(\Lambda)$ is Hermitian and $\mathrm{Tr}(S(\Lambda)\,P \otimes Q) \geq 0$ for all $P$, $Q$ on $\mathcal{H}^1$, $\mathcal{H}^2$ respectively. This
allows us to replace the quantification over Hermitian operators $A$ in Theorem 1 with a
quantification over maps $\Lambda$. The isomorphism $S(\Lambda)$ is chosen as $S(\Lambda) = (\mathbf{1} \otimes \Lambda)P_0$, where
$P_0$ is the projector onto a maximally entangled state; and we reach:




^4 For the forward implication, if $\mathrm{Tr}(A\rho) \geq 0$ for all separable states, it is so for all tensor
products of 1-d projectors; for higher dimensional projectors, just take sums of 1-d cases. For
the converse, if $\forall P\,\forall Q\,(\mathrm{Tr}(A\,P \otimes Q) \geq 0)$, then $\mathrm{Tr}(A\,P_i \otimes Q_i) \geq 0$, where $P_i$, $Q_i$ are 1-d
projectors; and it follows that $\mathrm{Tr}(A \sum_i \lambda_i P_i \otimes Q_i) \geq 0$. But any separable $\rho$ may be written
as $\rho_{\mathrm{sep}} = \sum_i \lambda_i P_i \otimes Q_i$; therefore $\mathrm{Tr}(A\rho_{\mathrm{sep}}) \geq 0$. QED.

^5 In (Horodecki et al, 1996a), positive maps between spaces of operators with differing
dimensionality are considered, to allow for systems where $\dim\mathcal{H}^1 \neq \dim\mathcal{H}^2$. This complication
will be suppressed in the following.

Proposition 3 $\rho$ is separable $\leftrightarrow \forall\Lambda\,\big(\Lambda \text{ +ve} \rightarrow \mathrm{Tr}[(\mathbf{1} \otimes \Lambda\,P_0)\,\rho] \geq 0\big)$.

It is then argued that the condition on the trace in the consequent of the RHS of prop. 3
is equivalent to the condition $\mathrm{Tr}[(\mathbf{1} \otimes \Lambda\,\rho)\,P_0] \geq 0$, i.e., a positive map is now considered
acting on $\rho$, rather than $P_0$. Note that

Proposition 4 $\forall\Lambda\,(\Lambda \text{ +ve} \rightarrow \mathbf{1} \otimes \Lambda\,\rho \geq 0) \rightarrow \forall\Lambda\,\big(\Lambda \text{ +ve} \rightarrow \mathrm{Tr}[(\mathbf{1} \otimes \Lambda\,\rho)\,P_0] \geq 0\big)$,

as if $\mathbf{1} \otimes \Lambda\,\rho$ is a positive operator it will have positive expectation value with all
projectors, including $P_0$. This, together with prop. 3, entails the backwards implication in
the following theorem:

Theorem 2 (Horodeckis, 1996) $\rho$ is separable $\leftrightarrow \forall\Lambda\,(\Lambda \text{ +ve} \rightarrow \mathbf{1} \otimes \Lambda\,\rho \geq 0)$.

Again, the forward implication is straightforward, while it is the converse that rests, via
props. 4 and 3, on the lemma concerning entanglement witnesses (prop. 2).

Theorem 2 succeeds in characterizing fully the property of separability in terms of
the requirement of remaining positive under all positive maps; however it does involve
a quantification over all positive maps. To reach a manageable operational
characterization, Horodecki et al note that for $2 \otimes 2$ and $2 \otimes 3$ dimensional systems, the
positive maps are decomposable in the form $\Lambda = \Lambda_1^{CP} + \Lambda_2^{CP}\,T$, where the $\Lambda_i^{CP}$ are
completely positive maps and $T$ is the transpose operation (Størmer, 1963; Woronowicz, 1976).
Thus $\mathbf{1} \otimes \Lambda\,\rho \geq 0$ for all positive maps $\Lambda$ is true if (and only if) $\mathbf{1} \otimes T\rho \geq 0$, as the
only way in which $\mathbf{1} \otimes \Lambda\,\rho$ could fail to be positive is if $\mathbf{1} \otimes T\rho$ fails to be; and we
finally reach the conclusion

Theorem 3 (Horodeckis 1996) A state $\rho$ acting on $\mathbb{C}^2 \otimes \mathbb{C}^2$ or $\mathbb{C}^2 \otimes \mathbb{C}^3$ is separable
if and only if its partial transpose is a positive operator.
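
As a hedged illustration of Theorem 3 for two qubits, the following sketch builds the partial transpose of the pure state $\cos\theta\,|00\rangle + \sin\theta\,|11\rangle$ and exhibits a vector with which it has a negative expectation value. The helper names are hypothetical, and rather than computing eigenvalues we simply evaluate the expectation value with the singlet vector, which equals $-\cos\theta\sin\theta$ for this family of states.

```python
import math


def projector(psi):
    """Density matrix |psi><psi| for a 4-component two-qubit vector."""
    return [[psi[r] * psi[c].conjugate() for c in range(4)] for r in range(4)]


def partial_transpose_2(rho):
    """Transpose the second qubit: <i j|rho|k l> becomes <i l|rho|k j>.

    The basis index is 2*i + j, with i labelling qubit 1 and j qubit 2.
    """
    out = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    out[2 * i + l][2 * k + j] = rho[2 * i + j][2 * k + l]
    return out


def expectation(op, psi):
    """Return <psi|op|psi> for a 4x4 operator and a 4-component vector."""
    op_psi = [sum(op[r][c] * psi[c] for c in range(4)) for r in range(4)]
    return sum(psi[r].conjugate() * op_psi[r] for r in range(4)).real


def schmidt_state(theta):
    """cos(theta)|00> + sin(theta)|11>; a product state when sin(2*theta) is 0."""
    return [complex(math.cos(theta)), 0j, 0j, complex(math.sin(theta))]


singlet = [0j, complex(1 / math.sqrt(2)), complex(-1 / math.sqrt(2)), 0j]

entangled_pt = partial_transpose_2(projector(schmidt_state(math.pi / 6)))
product_pt = partial_transpose_2(projector(schmidt_state(0.0)))

# Negative for the entangled state: equals -cos(theta) * sin(theta).
value_entangled = expectation(entangled_pt, singlet)
# The product state |00><00| is left unchanged by the partial transpose.
product_unchanged = product_pt == projector(schmidt_state(0.0))
```

A single negative expectation value is enough to show that the partially transposed operator is not positive; establishing positivity in general would need the full spectrum, which this sketch does not compute.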



5.1.2 The majorization condition 

It is a remarkable feature of entanglement that the state of a joint system may be pure 
while the states of the individual subsystems are mixed. It is this aspect of entanglement 
that Schrödinger had in mind in his well-known statement that

Maximal knowledge of a total system does not necessarily include total
knowledge of all its parts, not even when these are fully separated from each
other and at the moment are not influencing each other at all. (Schrödinger,
1935b, §10)^6

For example, with a pair of qubits in the singlet state, the joint state is pure, while the
reduced states of the subsystems are maximally mixed. If we look at the von Neumann
entropy as a measure of mixedness of these states, the entropy of the singlet state will
be zero, while the entropies of each of the subsystems will be 1. This phenomenon
couldn't obtain with the Shannon information of a pair of classical random variables,
as $H(X \wedge Y) \geq H(X), H(Y)$; and this line of thought has led to the investigation
of various entropic inequalities as criteria for entanglement (Horodecki et al; Cerf and
Adami; Tsallis et al, 2001).

^6 Note, however, that this statement is not the most felicitous, as it is ambiguous between the thought
that we lack total knowledge of the subsystems because there are facts to know about the individual
subsystems of which we are ignorant; and the (perhaps happier) thought that there simply is no
further knowledge to be had regarding the properties of subsystems individually than is given by their
reduced density matrix, which in the case being considered, won't be pure.

This aspect of entanglement achieved its definitive characterization in the majorization
criterion of Nielsen and Kempe (2001). We have already met the majorization
relation, as the underlying notion of disorder on which measures of uncertainty
provide a conventionally chosen numerical scale; and we saw that the relation applied both
to probability distributions and to the vectors of eigenvalues of density matrices. For a
bipartite system, therefore, we may consider how the vector of eigenvalues $\lambda(\rho^{12})$ of the
joint system compares to the vectors of eigenvalues of the subsystems, $\lambda(\rho^{1})$, $\lambda(\rho^{2})$.

Nielsen and Kempe (2001) showed that if the state $\rho^{12}$ is separable, then

$$\lambda(\rho^{12}) \prec \lambda(\rho^{1}) \quad\text{and}\quad \lambda(\rho^{12}) \prec \lambda(\rho^{2}). \qquad (5.5)$$

That is, in words: if a state is separable, then it is more disordered globally than it
is locally; the vector of eigenvalues of the joint state is majorized by the vectors of
eigenvalues of the reduced states of the subsystems. If we then have any Schur concave
function $U$ that may serve as a measure of uncertainty, we will have the inequalities:

$$U(\lambda(\rho^{12})) \geq U(\lambda(\rho^{1})),\; U(\lambda(\rho^{2})). \qquad (5.6)$$

For a separable state there is more uncertainty associated with the global state than with
the states of subsystems (for all measures of uncertainty). Contrapositively, if there is
less uncertainty associated with the global state than there is with the states of the
subsystems, then the global state must be entangled.

Importantly, Nielsen and Kempe (2001) also proved that the majorization condition
is only a necessary condition for separability and not a sufficient one, as there
exist entangled states with the same global and local spectra as separable ones; in this
case, eqn. (5.5) will not be able to distinguish between entangled and separable states.
This demonstrates the inherent limitation of the thought expressed in the quotation of
Schrödinger above as a characterisation of entanglement^7.
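
The majorization test (5.5) is easy to state operationally once the three spectra are known. The following is a minimal sketch under the assumption that the eigenvalue vectors are supplied directly as lists of probabilities; the function names are hypothetical.

```python
def majorized_by(x, y, tol=1e-12):
    """True if x is majorized by y (x ≺ y), i.e. x is at least as disordered as y.

    Both inputs are probability vectors; the shorter one is padded with zeros.
    """
    n = max(len(x), len(y))
    xs = sorted(list(x) + [0.0] * (n - len(x)), reverse=True)
    ys = sorted(list(y) + [0.0] * (n - len(y)), reverse=True)
    sum_x = sum_y = 0.0
    for xi, yi in zip(xs, ys):
        sum_x += xi
        sum_y += yi
        # Each partial sum of the ordered x must not exceed that of y.
        if sum_x > sum_y + tol:
            return False
    return True


def passes_majorization(joint, local_1, local_2):
    """Necessary condition (5.5) for separability, given the three spectra."""
    return majorized_by(joint, local_1) and majorized_by(joint, local_2)


# Singlet: pure joint state, maximally mixed reduced states.
singlet_ok = passes_majorization([1.0, 0.0, 0.0, 0.0], [0.5, 0.5], [0.5, 0.5])
# Maximally mixed two-qubit state, which is a product state.
mixed_ok = passes_majorization([0.25] * 4, [0.5, 0.5], [0.5, 0.5])
```

A `False` result certifies entanglement, whereas `True` is inconclusive, in line with the condition being necessary but not sufficient for separability.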

5.1.3 The tetrahedron of Bell-diagonal states 

As our final piece of background it will be helpful to note that the representation of
density operators in the form (5.2) becomes particularly simple if the matrix $c_{ij}$ happens
to be diagonal; then we may represent the correlation matrix in the easily visualisable
form of a vector, $\mathbf{c}$, in 3 dimensional real space, given by the components $c_{ii}$. This fact
is made use of in Horodecki and Horodecki (1996), for example.




In particular, if it is also the case that a = b = 0, i.e., the states of the subsystems are 
maximally mixed, then the density operator is completely characterized by the vector c. 
This corresponds to the class of Bell-diagonal states, viz. the class of states that results 
from taking convex combinations of Bell state projectors. 

For the four projectors onto the Bell states, a and b are indeed zero, as these are 
maximally entangled states; and as is well known, the c vectors corresponding to the 
Bell state projectors are: 

$$\mathbf{c}_{\phi^+} = (1,-1,1), \quad \mathbf{c}_{\phi^-} = (-1,1,1), \quad \mathbf{c}_{\psi^+} = (1,1,-1), \quad \mathbf{c}_{\psi^-} = (-1,-1,-1). \qquad (5.7)$$

^7 It was perhaps not widely appreciated immediately that the Nielsen and Kempe result brings to an
end at a stroke the programme of finding entropic and related criteria for entanglement, e.g. using Rényi
and Tsallis entropies. This is evident following Uffink's characterisation of uncertainty measures based
on the majorization relation (which includes quantities of this type), as all such criteria will be implied
by the condition (5.5). (Latterly there is some appreciation of this, see e.g. Rossignoli and Canosa
(2003).) Furthermore, in light of the Nielsen and Kempe result, we know without further ado that
criteria of this form can only be sufficient and not necessary for entanglement.




These four vectors correspond to the vertices of a tetrahedron T centred on the origin 
(see Figure 5.1); and the Bell-diagonal states will thus correspond to vectors lying on or
within the surfaces of $T$. (Furthermore, every vector $\mathbf{c}$ whose endpoint lies on or within
$T$ will correspond to a Bell-diagonal state.)

Now consider density matrices $\varrho$ with $\mathbf{a}, \mathbf{b} \neq 0$, but still with $c_{ij}$ diagonal. The
tetrahedron $T$ remains important for these states, as no vector whose end point lies
beyond $T$ can correspond to a valid density matrix, as Horodecki and Horodecki (1996)
note. If we were to take such a vector, $\mathbf{c}'$, then it would give rise to an operator that is
not positive, for it would have a negative expectation value with one of the Bell state
projectors.

To see this, note that the expectation value of the projector $P_{\psi^-}$ onto the singlet
state, say, with a valid density operator, will take the form:

$$\mathrm{Tr}(P_{\psi^-}\varrho) = \tfrac{1}{4}\big(1 + \mathbf{c}_{\psi^-} \cdot \mathbf{c}\big). \qquad (5.8)$$

This expectation value has to be greater than or equal to zero if $\varrho$ is to be a positive
operator; and now note that $1 + \mathbf{c}_{\psi^-} \cdot \mathbf{c} = 0$ defines a plane normal to $\mathbf{c}_{\psi^-}$ (at a signed
distance of $-1/\sqrt{3}$ from the origin), on the positive side of which the vector $\mathbf{c}$ has to lie, if it
is to belong to a positive operator. This plane, of course, is the face of $T$ normal to
$\mathbf{c}_{\psi^-}$. The other faces of the tetrahedron arise in the same way, as planes beyond which
a vector $\mathbf{c}'$ would belong to an operator having a negative expectation value with the
corresponding Bell state projector. Thus in order for a diagonal matrix $c_{ij}$ to belong
to a positive operator, the vector composed of its diagonal elements must lie within the
tetrahedron $T$.

Figure 5.1: Bell-diagonal states may be represented by the diagonal components of the
correlation matrix $c_{ij}$. The vertices of the tetrahedron $T$ correspond to the four Bell states
$|\phi^+\rangle$, $|\phi^-\rangle$, $|\psi^+\rangle$, $|\psi^-\rangle$. A Bell-diagonal state is separable iff it corresponds to a point belonging
to the central octahedron $T \cap -T$ (Horodecki and Horodecki, 1996).

It was shown by Horodecki and Horodecki (1996) (using entanglement witnesses
based on Werner's 'swap' operation (Werner, 1989)) that a necessary condition for a
density operator with diagonal correlation matrix to be separable is that the end point
of its $\mathbf{c}$ vector lie on or within the octahedron given by the intersection of $T$ with its
reflection through the origin, $-T$ (Figure 5.1). Furthermore, for Bell-diagonal states, this
condition is a sufficient one too, for every vector lying within the octahedron corresponds
to a positive operator (this property in general fails for $\mathbf{a}, \mathbf{b} \neq 0$); in particular, note
that for $\mathbf{a} = \mathbf{b} = 0$ the vertices of the octahedron correspond to separable states, hence
points within the octahedron correspond to convex combinations of separable states,
which themselves will be separable.

Finally, note that the entanglement properties of a state are invariant under $U_1 \otimes U_2$
rotation. Horodecki and Horodecki (1996) point out that with such unitary operations
on subsystems, the correlation matrix of a joint system can always be brought into
diagonal form^8, hence the simplified 3 dimensional representation with tetrahedron and
octahedron may be used to study the entanglement properties of all $2 \otimes 2$ states. For all
states with maximally mixed reduced states for subsystems, therefore, the condition of
their corresponding $\mathbf{c}$ vector belonging to the octahedron is both necessary and sufficient
for separability (Horodecki and Horodecki, 1996).
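
The geometry of the tetrahedron and octahedron can be sketched in a few lines. The code below assumes the standard Pauli-matrix conventions, under which the four Bell projectors have correlation vectors $(1,-1,1)$, $(-1,1,1)$, $(1,1,-1)$ and $(-1,-1,-1)$, and it applies only to states with $\mathbf{a} = \mathbf{b} = 0$ and diagonal correlation matrix. The function names and the Werner-state example are ours.

```python
# Correlation vectors of the four Bell projectors (the vertices of T),
# assuming the standard Pauli-matrix conventions.
BELL_VERTICES = {
    "phi+": (1, -1, 1),
    "phi-": (-1, 1, 1),
    "psi+": (1, 1, -1),
    "psi-": (-1, -1, -1),
}


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def in_tetrahedron(c, tol=1e-12):
    """Positivity: 1 + c_v . c >= 0 for every Bell vertex c_v."""
    return all(1 + dot(v, c) >= -tol for v in BELL_VERTICES.values())


def in_octahedron(c, tol=1e-12):
    """Membership of the intersection of T and -T: |c_v . c| <= 1 for every vertex."""
    return all(abs(dot(v, c)) <= 1 + tol for v in BELL_VERTICES.values())


def classify_bell_diagonal(c):
    """Classify a Bell-diagonal state (a = b = 0) by its correlation vector c."""
    if not in_tetrahedron(c):
        return "not a state"
    return "separable" if in_octahedron(c) else "entangled"


def werner(p):
    """Correlation vector of p * singlet + (1 - p) * maximally mixed state."""
    return (-p, -p, -p)
```

For instance, `classify_bell_diagonal(werner(0.5))` reports an entangled state and `classify_bell_diagonal(werner(0.2))` a separable one, consistent with the familiar threshold $p = 1/3$ for this family.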



5.2 Characterizations in the Deutsch-Hayden representation

With this background material behind us, let's begin with some very simple properties
of the Deutsch-Hayden representation in the form (5.2).

First, recall that the requirement of unit trace and positivity imply that for a density
operator $\rho$, $\mathrm{Tr}(\rho^2) \leq 1$. This further implies that

$$\tfrac{1}{4}\Big(1 + a^2 + b^2 + \sum_{ij} c_{ij}^2\Big) \leq 1, \qquad (5.9)$$

hence for valid density matrices, $a^2 + b^2 + \sum_{ij} c_{ij}^2 \leq 3$, with equality being achieved for
pure states.

^8 These unitary operations give rise to 3-d rotation matrices $R_1$ and $R_2$, whose effect on the correlation
matrix $(C)_{ij} = c_{ij}$ will be: $R_1 C R_2^{T}$.

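For product states ($\rho^{12} = \rho^{1} \otimes \rho^{2}$), $\varrho$ will clearly take the form:

$$\varrho = \tfrac{1}{4}\Big(\mathbf{1} \otimes \mathbf{1} + \mathbf{a}\cdot\boldsymbol{\sigma} \otimes \mathbf{1} + \mathbf{1} \otimes \mathbf{b}\cdot\boldsymbol{\sigma} + \sum_{ij} a_i b_j\,\sigma_i \otimes \sigma_j\Big), \qquad (5.10)$$

while separable states will be of the form $\rho_{\mathrm{sep}} = \sum_k \lambda_k\,\rho^{1}_k \otimes \rho^{2}_k$, thence:

$$\varrho_{\mathrm{sep}} = \sum_k \lambda_k\,\tfrac{1}{4}\Big(\mathbf{1} \otimes \mathbf{1} + \mathbf{a}^k\cdot\boldsymbol{\sigma} \otimes \mathbf{1} + \mathbf{1} \otimes \mathbf{b}^k\cdot\boldsymbol{\sigma} + \sum_{ij} a^k_i b^k_j\,\sigma_i \otimes \sigma_j\Big), \qquad (5.11)$$

where the $\lambda_k$ are convex coefficients. Here the vector $\mathbf{a}$ is given by $\sum_k \lambda_k \mathbf{a}^k$, $\mathbf{b}$ by
$\sum_k \lambda_k \mathbf{b}^k$, and similarly, $c_{ij}$ by $\sum_k \lambda_k a^k_i b^k_j$.

The conditions for pure state entanglement are straightforward. The joint state must
be pure, hence $a^2 + b^2 + \sum_{ij} c_{ij}^2 = 3$, while the expectation values for joint observables
do not factorise: $c_{ij} \neq a_i b_j$. If $c_{ij} = a_i b_j$ then the state is always unentangled, but the
converse does not hold in general unless the joint state is pure^9.

^9 For mixed states, factorisation might fail because of classical correlations, as in eqn. (5.11).

Another way to state the necessary and sufficient condition for pure state
entanglement that makes more use of the geometric properties of the representation is to
note that if $\varrho$ is pure, it will be entangled iff the reduced states are mixed. In this
representation, the simplest way to investigate purity is just to consider the length of
vectors: pure states will be of unit length, mixed states of less than unit length. Hence
if $\varrho\cdot\varrho = 1$, then $\varrho$ is entangled iff $a^2, b^2 < 1$.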
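
The pure-state conditions above can be checked directly on the components $(\mathbf{a}, \mathbf{b}, c_{ij})$. The following sketch assumes the normalisation of eqn. (5.9), so that the purity is $\tfrac{1}{4}(1 + a^2 + b^2 + \sum_{ij} c_{ij}^2)$; the function names, and the example family $\cos\theta\,|00\rangle + \sin\theta\,|11\rangle$ with its components, are ours.

```python
import math


def squared_length(v):
    return sum(x * x for x in v)


def correlation_squared_length(c):
    """Sum over i, j of c_ij^2 for a 3x3 correlation matrix."""
    return sum(x * x for row in c for x in row)


def purity(a, b, c):
    """Tr(rho^2) = (1 + a^2 + b^2 + sum c_ij^2) / 4, cf. eqn. (5.9)."""
    return (1 + squared_length(a) + squared_length(b)
            + correlation_squared_length(c)) / 4


def pure_state_is_entangled(a, b, c, tol=1e-9):
    """For a pure joint state: entangled iff both reduced states are mixed."""
    if abs(purity(a, b, c) - 1) > tol:
        raise ValueError("joint state is not pure")
    return squared_length(a) < 1 - tol and squared_length(b) < 1 - tol


def schmidt_components(theta):
    """Components (a, b, c) of cos(theta)|00> + sin(theta)|11>."""
    z = math.cos(2 * theta)   # <sigma_z x 1> = <1 x sigma_z>
    s = math.sin(2 * theta)   # <sigma_x x sigma_x> = -<sigma_y x sigma_y>
    return (0.0, 0.0, z), (0.0, 0.0, z), [[s, 0, 0], [0, -s, 0], [0, 0, 1]]
```

For this family $a^2 + b^2 + \sum_{ij} c_{ij}^2 = 2\cos^2 2\theta + 2\sin^2 2\theta + 1 = 3$ for every $\theta$, as purity requires, while the Bloch vectors shrink below unit length exactly when $\sin 2\theta \neq 0$.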

5.2.1 Some sufficient conditions for entanglement 

Let us now consider some sufficient conditions for entanglement when the joint state is 
mixed. 

A direct route is to consider a constraint that is implied by the correlation matrix
for separable states being given by $c_{ij} = \sum_k \lambda_k a^k_i b^k_j$. Note that $f(x) = x^2$ is a convex
function. It follows that

$$\Big(\sum_k \lambda_k a^k_i b^k_j\Big)^2 \leq \sum_k \lambda_k \big(a^k_i b^k_j\big)^2,$$

and thence

$$\sum_{ij} c_{ij}^2 \leq \sum_k \lambda_k \sum_{ij} (a^k_i)^2 (b^k_j)^2 = \sum_k \lambda_k\,(a^k)^2 (b^k)^2.$$

But $(a^k)^2$ and $(b^k)^2$ are just the lengths squared of the Bloch vectors of the reduced
density matrices of systems 1 and 2 respectively, for the $k$th state making up the mixture;
and therefore $(a^k)^2, (b^k)^2 \leq 1$. Therefore we obtain the constraint:

Proposition 5 If $\varrho$ is separable then $\sum_{ij} c_{ij}^2 \leq 1$.

$\sum_{ij} c_{ij}^2$ is the length squared of the component of the vector $\varrho$ in the 9 dimensional
subspace of $V_h(\mathbb{C}^4)$ spanned by joint observables. We learn from prop. 5 that if the
length of this component of the vector is greater than 1, then the state is entangled.






This squared length is invariant under unitary operations of the form $U_1 \otimes U_2$, hence
the constraint is unchanged if such a transformation of $\varrho$ is applied in order to diagonalize
the correlation matrix. Following diagonalization, the constraint will be that $\mathbf{c}\cdot\mathbf{c} \leq 1$ for
separable states. The equation $\mathbf{c}\cdot\mathbf{c} = 1$ is the unit sphere in three dimensions enclosing the
octahedron of separable states, making it clear that prop. 5 is not a sufficient condition for
separability: there exist entangled states with $\sum_{ij} c_{ij}^2 \leq 1$.

A stronger condition may be obtained by making use of the majorization criterion.
In the Deutsch-Hayden representation, the simplest measure of disorder will be the
length-squared measure. With this measure of disorder, the majorization criterion (5.5)
implies that for separable states, the length squared of $\varrho$ is less than or equal to the
length squared of the reduced state of system 1; and similarly, less than or equal to the
length squared of the reduced state of system 2. In terms of components we will have:

$$\tfrac{1}{4}\Big(1 + a^2 + b^2 + \sum_{ij} c_{ij}^2\Big) \leq \tfrac{1}{2}\big(1 + a^2\big), \qquad (5.12)$$

$$\tfrac{1}{4}\Big(1 + a^2 + b^2 + \sum_{ij} c_{ij}^2\Big) \leq \tfrac{1}{2}\big(1 + b^2\big), \qquad (5.13)$$

which rearrange to give the pair of constraints:

Proposition 6 If $\varrho$ is separable then $\sum_{ij} c_{ij}^2 \leq 1 \pm |a^2 - b^2|$.

Together with prop. 5 then, this gives us a nested trio of hyperspheres in the 9
dimensional subspace of joint observables. On the inside is the sphere given by
$\sum_{ij} c_{ij}^2 = 1 - |a^2 - b^2|$, then there is the sphere $\sum_{ij} c_{ij}^2 = 1$; while the outermost sphere is
$\sum_{ij} c_{ij}^2 = 1 + |a^2 - b^2|$. If a state is separable then the component of $\varrho$ in the subspace of joint
observables must lie on or within the innermost sphere; if it lies beyond it, the state is
entangled.

Of course, we know that the majorization criterion is only a necessary, and not a
sufficient condition for separability; and the application made of the criterion here is
further weakened as a result of being specialised to a particular choice of measure of
disorder. Thus we know that props. 5 and 6 are in general only sufficient conditions for
entanglement, but nevertheless they provide us with a pleasing geometrical picture of
certain conditions under which a state will be entangled.
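
Props. 5 and 6 translate into two simple tests on the squared lengths $a^2$, $b^2$ and $\sum_{ij} c_{ij}^2$. The sketch below is an illustration under the conventions of eqn. (5.9); the function names and the Werner-state example (with $\mathbf{a} = \mathbf{b} = 0$ and $c_{ij} = -p\,\delta_{ij}$) are assumptions of ours rather than part of the original discussion.

```python
def squared_norms(a, b, c):
    """Return (a^2, b^2, sum_ij c_ij^2) for Bloch vectors a, b and 3x3 matrix c."""
    a2 = sum(x * x for x in a)
    b2 = sum(x * x for x in b)
    c2 = sum(x * x for row in c for x in row)
    return a2, b2, c2


def entanglement_flags(a, b, c):
    """Sufficient conditions for entanglement from props. 5 and 6.

    True means the state is certainly entangled; False is inconclusive.
    """
    a2, b2, c2 = squared_norms(a, b, c)
    return {
        "prop5": c2 > 1,                  # beyond the sphere sum c_ij^2 = 1
        "prop6": c2 > 1 - abs(a2 - b2),   # beyond the innermost sphere
    }


def werner_components(p):
    """Components of p * singlet + (1 - p) * maximally mixed state."""
    zero = (0.0, 0.0, 0.0)
    c = [[-p if i == j else 0.0 for j in range(3)] for i in range(3)]
    return zero, zero, c
```

For the Werner family $\sum_{ij} c_{ij}^2 = 3p^2$, so both tests fire only for $p > 1/\sqrt{3}$, although these states are already entangled for $p > 1/3$; this illustrates that the conditions are sufficient but not necessary for entanglement.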

5.2.2 The PPT and reduction criteria 

Let us now consider the positive partial transpose condition in the Deutsch-Hayden 
formalism. The effect of the partial transpose is straightforward. The Pauli matrices 
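$\sigma_x$ and $\sigma_z$ are invariant under transposition, so the only changes will be in the $\sigma_y$
components of the state: $\sigma_y^{T} = -\sigma_y$, thus:

$$\mathbf{1} \otimes T\,\varrho = \tfrac{1}{4}\Big(\mathbf{1} \otimes \mathbf{1} + \mathbf{a}\cdot\boldsymbol{\sigma} \otimes \mathbf{1} + \mathbf{1} \otimes \mathbf{b}'\cdot\boldsymbol{\sigma} + \sum_{ij} c'_{ij}\,\sigma_i \otimes \sigma_j\Big), \qquad (5.14)$$

where

$$\mathbf{b}' = (b_x, -b_y, b_z), \qquad c'_{ij} = \begin{cases} -c_{ij} & \text{if } j = y, \\ \phantom{-}c_{ij} & \text{otherwise.} \end{cases} \qquad (5.15)$$

Now it is clear enough why positive partial transpose is a necessary condition for
separability. If we consider first a product state, the effect of the partial transpose is to
reflect the Bloch vector for system 2 in the z-x plane, thus we end up with a perfectly
good state again; if our initial state is a separable state, each of the $\mathbf{b}^k$ vectors over
which we take a convex sum is similarly reflected, each again represents a perfectly good
state, so their convex sum represents a perfectly good state.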


It would be interesting to see, however, whether studying the representation of states
in the Deutsch-Hayden form could provide us with insight into why, if a state is
entangled, it will become non-positive under the action of the partial transpose. By
investigating eqn. (5.14), can we learn something about why the replacement $\mathbf{b} \mapsto \mathbf{b}'$, $c_{ij} \mapsto c'_{ij}$
leads to non-positive operators when we have entanglement? We have already seen why
this transformation is unproblematic for separable states: then $c_{ij} \mapsto c'_{ij}$ can simply be
understood as

$$\sum_k \lambda_k a^k_i b^k_j \;\mapsto\; \sum_k \lambda_k a^k_i b'^{k}_j,$$

which is clearly consistent with the change $\mathbf{b} \mapsto \mathbf{b}'$ and will give rise to a valid state.
What is it about entangled states that changes this?

Of course, in an important sense, the result of Horodecki et al (1996a) gives a
complete answer to this question: if a state were not to become non-positive under partial
transposition it would not be entangled; it would be separable instead. As we shall see
however, using the Deutsch-Hayden representation it is also possible, in some cases, to
get a more direct answer to our question, in addition.

We will also consider another condition, equivalent to the positive partial transpose
condition for $2 \otimes 2$ and $2 \otimes 3$ dimensions, known as the reduction criterion (Horodecki and
Horodecki, 1999), which is based on the positive map $\Lambda_{\mathrm{red}}(A) = \mathbf{1}\,\mathrm{Tr}(A) - A$. The effect of
this map applied to subsystem 2 of a joint state $\rho^{12}$ will be:

$$\mathbf{1} \otimes \Lambda_{\mathrm{red}}\,\rho^{12} = \rho^{1} \otimes \mathbf{1} - \rho^{12},$$

where $\rho^{1}$ is the reduced density operator of system 1. This gives rise to the following
criterion for $2 \otimes 2$ and $2 \otimes 3$ dimensional systems:

Proposition 7 (Reduction criterion) $\rho$ is separable $\leftrightarrow \rho^{1} \otimes \mathbf{1} - \rho^{12} \geq 0$.

(As for the positive partial transpose, this condition will only be a necessary one for
higher dimensions.) Clearly a similar condition may be generated with the application
of $\Lambda_{\mathrm{red}}$ to the first system instead. For qubits, $\Lambda_{\mathrm{red}}$ is equivalent to a $\pi$ rotation about
the y-axis, followed by the transpose operation.

In the Deutsch-Hayden representation, the effect of the reduction map will be

$$\mathbf{1} \otimes \Lambda_{\mathrm{red}}\,\varrho = \tfrac{1}{4}\Big(\mathbf{1} \otimes \mathbf{1} + \mathbf{a}\cdot\boldsymbol{\sigma} \otimes \mathbf{1} - \mathbf{1} \otimes \mathbf{b}\cdot\boldsymbol{\sigma} - \sum_{ij} c_{ij}\,\sigma_i \otimes \sigma_j\Big).$$

The Bloch vector for system 2 has been reflected through the origin, while $c_{ij} \mapsto -c_{ij}$.
Again, can we gain any insight into why a change of this sort will lead to non-positive
operators when we have entanglement?
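
At the level of components, the two maps just described are simple sign changes. The following sketch (with hypothetical function names) encodes the rules stated above for the partial transpose and the reduction map, together with the trace inner product $\mathrm{Tr}(XY) = \tfrac{1}{4}(1 + \mathbf{a}\cdot\mathbf{a}' + \mathbf{b}\cdot\mathbf{b}' + \sum_{ij} c_{ij}c'_{ij})$, which follows from the trace orthogonality of the Pauli products and assumes both operators have unit identity component.

```python
def partial_transpose_components(a, b, c):
    """1 x T: flip the y component of b and the entries c_iy of c."""
    b_t = (b[0], -b[1], b[2])
    c_t = [[row[0], -row[1], row[2]] for row in c]
    return a, b_t, c_t


def reduction_components(a, b, c):
    """1 x Lambda_red: b -> -b and c_ij -> -c_ij, with a unchanged."""
    return a, tuple(-x for x in b), [[-x for x in row] for row in c]


def trace_product(x, y):
    """Tr(XY) for operators given by components (a, b, c), unit identity part."""
    (a1, b1, c1), (a2, b2, c2) = x, y
    dot_a = sum(p * q for p, q in zip(a1, a2))
    dot_b = sum(p * q for p, q in zip(b1, b2))
    dot_c = sum(p * q for r1, r2 in zip(c1, c2) for p, q in zip(r1, r2))
    return (1 + dot_a + dot_b + dot_c) / 4


# Singlet state: a = b = 0 and c_ij = -delta_ij.
singlet = ((0, 0, 0), (0, 0, 0), [[-1, 0, 0], [0, -1, 0], [0, 0, -1]])

# Expectation value of the reduced operator in the singlet state itself.
reduced_value = trace_product(reduction_components(*singlet), singlet)
```

For the singlet the result is $\tfrac{1}{4}(1 - 3) = -\tfrac{1}{2}$, so the operator produced by the reduction map is not positive, as the reduction criterion requires for an entangled state.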

Let us begin with the simplest case. Can we explain why a pure entangled state $\varrho$
will become non-positive under partial transposition? We can, in at least two ways.

Consider first the effect of a $\pi$ rotation about the y-axis applied to the second of our
pair of systems in the joint state $\varrho$. The result will be:

$$(\mathbf{1} \otimes \sigma_y)\,\varrho\,(\mathbf{1} \otimes \sigma_y) = \tfrac{1}{4}\Big(\mathbf{1} \otimes \mathbf{1} + \mathbf{a}\cdot\boldsymbol{\sigma} \otimes \mathbf{1} - \mathbf{1} \otimes \mathbf{b}'\cdot\boldsymbol{\sigma} - \sum_{ij} c'_{ij}\,\sigma_i \otimes \sigma_j\Big),$$

where, as before, $\mathbf{b}'$ and $c'_{ij}$ are the system 2 Bloch vector and the correlation matrix
for $\varrho$ under partial transposition respectively (cf. eqn. (5.15)). The expectation value
of the partially transposed $\varrho$ with this rotated state will be:

$$\mathrm{Tr}\big[(\mathbf{1} \otimes T\varrho)\,(\mathbf{1} \otimes \sigma_y)\,\varrho\,(\mathbf{1} \otimes \sigma_y)\big] = \tfrac{1}{4}\Big(1 + a^2 - b^2 - \sum_{ij} c_{ij}^2\Big), \qquad (5.16)$$

since

$$\mathbf{b}'\cdot\mathbf{b}' = b^2 \quad\text{and}\quad \sum_{ij} c'_{ij}c'_{ij} = \sum_{ij} c_{ij}^2. \qquad (5.17)$$

It follows that $\mathbf{1} \otimes T\varrho$ will have a negative expectation value with the rotated state if

$$(a^2 - b^2) + \Big(1 - \sum_{ij} c_{ij}^2\Big) < 0. \qquad (5.18)$$

We know from the Schmidt decomposition (eqn. (5.3)), however, that for a pure joint
state, the reduced states of the subsystems will be equally mixed, therefore in this case
$a^2 = b^2$, and the first term in (5.18) will be zero. Furthermore, since $\varrho$ is pure, we know
(eqn. (5.9)) that $a^2 + b^2 + \sum_{ij} c_{ij}^2 = 3$, while as it is entangled, $a^2, b^2 < 1$; therefore,
$\sum_{ij} c_{ij}^2$ must be strictly greater than 1.

It follows that for any pure entangled state $\varrho$, its partial transpose will have a negative
expectation value with a canonically chosen one dimensional projector (in fact, the one
attained from $\varrho$ by the $\pi$ rotation about the y-axis on system 2) and for this reason
the partial transpose of such a state will not be positive. Similarly, the effect of the
reduction map on a pure entangled state $\varrho$ will be to produce an operator which has
negative expectation value with a canonically chosen projector: in this case, $\varrho$ itself.

So we may see why, for pure states to begin with, if $\sum_{ij} c_{ij}^2 > 1$, the effect of partial
transposition will be to give rise to a non-positive state; and we have seen furthermore,
that any pure entangled state will certainly have $\sum_{ij} c_{ij}^2 > 1$.

This condition can be related to some of our earlier discussions. Let's consider again
the tetrahedron and octahedron of Figure 5.1. Assume that the correlation matrix of the
state we are interested in has been diagonalized by a suitable $U_1 \otimes U_2$ unitary operation,
so that we may consider the vector $\mathbf{c}$ of the diagonal components alone. The effect of
partial transposition on $\mathbf{c}$ will be to reflect it in the z-x plane; and we may now see
why the end point of a $\mathbf{c}$ vector has to be within the central octahedron in order to be
associated with a separable state.

Recall that if the end point of a $\mathbf{c}$ vector lies beyond the tetrahedron $T$ then this
implies a negative expectation value with the Bell projector defining the appropriate
face. The effect of the partial transposition will be to reflect all of the vectors picking
out points lying within $T$ in the z-x plane; the image of $T$ under this transformation
will coincide with $-T$. If a vector $\mathbf{c}$ does not, after the partial transpose transformation,
still lie within $T$, it will now give rise to an operator that has a negative expectation
value with one of the Bell projectors. Therefore, it is only vectors picking out points
within the intersection of $T$ with its reflection in the z-x plane, that is, in the central
octahedron, that correspond to separable states. For $\mathbf{c}$ vectors whose endpoints lie in
the 'arms' of the tetrahedron, beyond the octahedron, the effect of the partial transpose
will be to reflect them beyond $T$; and we see that the operators associated with these
vectors will consequently be non-positive after partial transposition as they will have a
negative expectation value with one of the Bell projectors.

For diagonalized $c_{ij}$, the condition $\sum_{ij} c_{ij}^2 > 1$ translates into $\mathbf{c}\cdot\mathbf{c} > 1$. The sphere
$\mathbf{c}\cdot\mathbf{c} = 1$ is, recall, the sphere enclosing the central octahedron, so we see that if for a
state $\varrho$, $\sum_{ij} c_{ij}^2 > 1$, then the partial transpose of this state will be an operator which
has a negative expectation value with some maximally entangled state and hence must
be non-positive. The maximally entangled state in question will be a Bell projector
rotated by the $U_1 \otimes U_2$ operation that diagonalized $c_{ij}$ for the given $\varrho$.

We have now seen two reasons why a pure entangled state $\varrho$ will give rise to a
non-positive operator on partial transpose. From the requirements of normalisation and
positivity (eqn. (5.9)), we know that for a pure state to be entangled, $\sum_{ij} c_{ij}^2 > 1$, i.e.,
the length of the component of the state in the joint observable subspace is greater
than one. This means i) that the partial transpose of $\varrho$ will have a negative expectation
value with the $\mathbf{1} \otimes \sigma_y$ rotated $\varrho$; and ii) that the partial transpose of $\varrho$ will also have
a negative expectation value with a maximally entangled state. These give us direct
reasons why these entangled states will have a non-positive partial transpose.

In fact the considerations that lead to property (ii) do not depend essentially on
$\varrho$ being pure. We know from prop. 5 that if $\sum_{ij} c_{ij}^2 > 1$, then the associated state,
whether pure or mixed, is entangled; and from the reasoning we have just rehearsed
concerning the tetrahedron, we may infer that any such state will certainly give rise to a
non-positive operator under partial transposition, as it will have a negative expectation
value with a maximally entangled state.

The reasoning that led to property (i) may also be generalised to cover a large class
of mixed entangled states. In eqn. (5.18), assuming the purity of $\varrho$ meant that the first
term became equal to zero; this will not be so in general with mixed states. The other
role of the purity assumption was in allowing us to pick out a projector with which the
partially transposed operator would have a negative expectation value. If $\varrho$ is not a
pure state then the $\mathbf{1} \otimes \sigma_y$ rotated $\varrho$ will obviously not be a projector, though.

We may still use eqn. (5.18) to understand why some entangled states will give rise to
non-positive operators on partial transposition, or following application of the reduction
map, however. When $\varrho$ is mixed, (5.18) is indicating a simple condition under which
the partially transposed $\varrho$ becomes an entanglement witness. If for a given $\varrho$,

$$\sum_{ij} c_{ij}^2 > 1 + (a^2 - b^2),$$

then $\mathbf{1} \otimes T\varrho$ is an entanglement witness for the state resulting from applying the unitary
operation $\mathbf{1} \otimes \sigma_y$ to $\varrho$. The partially transposed operator will have positive expectation
value for all separable states (this property of the original density operator is not changed
under partial transpose) and a negative expectation value for the stated entangled state.
Similarly, when (5.18) holds, $\mathbf{1} \otimes \Lambda_{\mathrm{red}}\,\varrho$ will be an entanglement witness for $\varrho$ itself. We
know, finally, from the majorization criterion applied with the length squared measure
of disorder, prop. 6, that whenever (5.18) holds, then $\varrho$ will be entangled.

It is an interesting fact that the sufficient condition for entanglement derived from
the majorization condition should coincide with the condition under which the partially
transposed $\varrho$ can be seen immediately to be an entanglement witness for a particular
entangled state. To re-cap, the insight provided by this into why some entangled states
become non-positive under partial transposition is that any $\varrho$ for which (5.18) holds will
be entangled; and we can see it will certainly end up not being positive under partial
transposition, as it will be transformed into an entanglement witness for a particular
known state, hence will be an operator with a negative expectation value for that state,
hence will not be a positive operator.
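
The expectation value appearing in eqn. (5.16) can be checked against explicit $4 \times 4$ matrices. The sketch below is an illustration only: it assumes the standard Pauli matrices and the normalisation of (5.2) with a prefactor of $\tfrac{1}{4}$, uses hypothetical helper names, and relies on nothing beyond the Python standard library. Because the identity is purely algebraic, it holds for any real components, whether or not they describe a valid state.

```python
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
PAULI = [SX, SY, SZ]


def kron(p, q):
    """Kronecker product of two 2x2 matrices (row index 2*i + j)."""
    return [[p[i][k] * q[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]


def add_scaled(acc, coeff, m):
    """In place: acc += coeff * m for 4x4 matrices."""
    for r in range(4):
        for s in range(4):
            acc[r][s] += coeff * m[r][s]


def state_from_components(a, b, c):
    """(1x1 + a.sigma x 1 + 1 x b.sigma + sum_ij c_ij sigma_i x sigma_j) / 4."""
    rho = [[0j] * 4 for _ in range(4)]
    add_scaled(rho, 0.25, kron(I2, I2))
    for i in range(3):
        add_scaled(rho, 0.25 * a[i], kron(PAULI[i], I2))
        add_scaled(rho, 0.25 * b[i], kron(I2, PAULI[i]))
        for j in range(3):
            add_scaled(rho, 0.25 * c[i][j], kron(PAULI[i], PAULI[j]))
    return rho


def matmul(m, n):
    return [[sum(m[r][k] * n[k][s] for k in range(4)) for s in range(4)]
            for r in range(4)]


def partial_transpose_2(rho):
    """Transpose the second qubit of a 4x4 matrix."""
    out = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for l in range(2):
                    out[2 * i + l][2 * k + j] = rho[2 * i + j][2 * k + l]
    return out


def witness_value_explicit(a, b, c):
    """Tr[(1 x T rho)(1 x sigma_y) rho (1 x sigma_y)] from explicit matrices."""
    rho = state_from_components(a, b, c)
    rot = kron(I2, SY)  # sigma_y is Hermitian, so the rotation is rot * rho * rot
    rotated = matmul(matmul(rot, rho), rot)
    product = matmul(partial_transpose_2(rho), rotated)
    return sum(product[r][r] for r in range(4)).real


def witness_value_formula(a, b, c):
    """(1 + a^2 - b^2 - sum_ij c_ij^2) / 4, the right-hand side of (5.16)."""
    a2 = sum(x * x for x in a)
    b2 = sum(x * x for x in b)
    c2 = sum(x * x for row in c for x in row)
    return (1 + a2 - b2 - c2) / 4
```

For the singlet ($\mathbf{a} = \mathbf{b} = 0$, $c_{ij} = -\delta_{ij}$) both functions return $-\tfrac{1}{2}$, the negative expectation value that shows the partially transposed singlet projector is not positive.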

5.3 Summary 

In this chapter we have gained some insight into when a state will be entangled in 
the Deutsch-Hayden formalism. For pure states, the necessary and sufficient condition 
will be that cf^ > 1; that is, the component of the vector g in the 9 dimensional 
subspace of joint observables is of length greater than 1. For mixed states, this weakens 
to a sufficient condition. By applying the majorization condition, a generally wider 
sufficient condition is found: if the component of g in the joint observable subspace lies 
beyond the hypersphere given by J^ij cfj = 1— \a^ — b'^\ then the state is entangled. We 
were also able to gain some further understanding of why certain classes of entangled 
states will become non-positive under the effect of the partial transpose operation. A 
pure entangled state, we have said, will have cf^ > 1, but then such a state can be 
seen to become non-positive under partial transpose in at least two direct ways: it will 
have a negative expectation value with a certain canonically chosen one-dimensional 
projector; and it will have a negative expectation value with a maximally entangled 
state attained from a Bell state by the Ui (g) U2 rotation that diagonalizes the correlation 
matrix. This analysis can be extended to some mixed states. Those mixed states with 
'^ij ^ ^^^^ entangled and can also be seen to become non-positive under partial 
transposition due to a negative expectation value with a maximally entangled state. 
Finally, those mixed states with J^ij cfj > 1 — (fe^ — a^) will be entangled, and they 
become non-positive as they are transformed into entanglement witnesses. 
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We have been concerned in this chapter exclusively with bipartite entanglement. 
However it would be interesting to extend the discussion to multipartite cases. As 
Hewitt- Horsman has noted^°, the Deutsch-Hayden formalism is useful in the context of 
tripartite pure state entanglement for determining the presence of genuine three-party 
entanglement, as opposed to two-party. One begins by checking conditions of the form 
{<li.iQ2.j<l3,k) 7^ {qi,i){<l2,j){q3.k)- If tlicse are satisfied, then as the joint state is pure, 
there must be some entanglement present. The next stage is to check the following three 
sets of conditions: 



('7i.»«2jg3,fe) 7^ 



{qi,i){q2,jq3,k), 
{q2,j){qi^tq3.t), 
{qi,iq2,]){q3,k)- 



There are then two possibilities. Either all three (sets of) conditions are satisfied, in 
which case we have genuine 3 party entaglement, or two are satisfied and one is not, 
in which case we have 2-party entanglement (and depending on which two are satisfied, 
this will tell us which two systems are entangled.) 

Within the class of genuine three-party entangled states, there is a distinction be- 



tween GHZ-type entanglement l|Greenberger et al 



1989 1. states of the form: 



|V) = ^(|000) + |111)); 



and W-type entanglement l|Diir et al 



200(11 . states of the form: 



V3 



(|001) + |010) + |100)). 



It is plausible to conjecture that an extension of the approach used in this chapter will 
prove useful for distinguishing between states of these two different classes, as tracing 
out one system of a GHZ state leaves a separable (classically correlated) state, while 
tracing out one system of a W state leaves a component of maximally entangled two- 
party entanglement. This fact should find expression in the Deutsch-Hayden formalism 
in terms of differing lengths of components in the subspace belonging to two-party joint 
observables. 
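
The contrast between the two classes can be checked numerically. The following is a minimal sketch, not the Deutsch-Hayden calculation conjectured above: it assumes NumPy and uses the standard partial-transpose test on the two-qubit state left after tracing out the third qubit. All function names are illustrative choices, not notation from this chapter.

```python
import numpy as np

def ket(bits):
    """Computational-basis state vector for a bit string such as '010'."""
    v = np.zeros(2 ** len(bits))
    v[int(bits, 2)] = 1.0
    return v

def reduced_two_qubit_state(psi):
    """Trace out the third qubit of a three-qubit pure state."""
    m = psi.reshape(4, 2)        # rows: qubits 1 and 2, columns: qubit 3
    return m @ m.conj().T        # 4x4 density matrix of qubits 1 and 2

def partial_transpose(rho):
    """Transpose the second qubit of a two-qubit density matrix."""
    r = rho.reshape(2, 2, 2, 2)  # indices (a, b, a', b')
    return r.transpose(0, 3, 2, 1).reshape(4, 4)

def min_pt_eigenvalue(rho):
    """Smallest eigenvalue of the partial transpose; negative means entangled."""
    return float(np.linalg.eigvalsh(partial_transpose(rho)).min())

ghz = (ket("000") + ket("111")) / np.sqrt(2)
w = (ket("001") + ket("010") + ket("100")) / np.sqrt(3)

ghz_min = min_pt_eigenvalue(reduced_two_qubit_state(ghz))  # non-negative: separable remainder
w_min = min_pt_eigenvalue(reduced_two_qubit_state(w))      # negative: entangled remainder
```

For two qubits a negative eigenvalue of the partial transpose is both necessary and sufficient for entanglement, so the sign of the two results separates the GHZ case from the W case.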

^''Talk at Oxford Philosophy of Physics Discussion Group, 12 February 2003. 



Chapter 6 

Quantum Computation and
the Church-Turing Hypothesis

6.1 Introduction 

In this chapter we will be considering some of the philosophical questions raised by the
theory of quantum computing. First, and briefly, whether the efficiency of quantum
computing gives us an argument for a substantive notion of quantum information
(Section 6.2); and second, in more detail, we shall consider some questions regarding the
status of the Church-Turing hypothesis (Sections 6.3 and 6.4).

The advent of quantum computers has raised a question concerning the relationship 
between the classical theory of computation, based on the Church-Turing hypothesis,
and the quantum theory. It is quite common to find the claim that the quantum theory 
of computation is the more fundamental. However, one sometimes also encounters a 
much stronger claim to the effect that the quantum computer has succeeded in finally 
making sense of Turing's theory of computation, or that Turing's machines were really 
quantum mechanical all along. We shall be considering some of the issues that have 
arisen around this question of the relation between the classical and quantum theories 
of computation. 

Richard Feynman was the prophet of quantum computation. He pointed out that it 
seems that one cannot simulate the evolution of a quantum mechanical system efficiently 
on a classical computer. He took this to imply that there might be computational
benefits to be gained if computations are carried out using quantum systems themselves
rather than classical systems; and he went on to describe a universal quantum simulator
(Feynman, 1982). However, it is with Deutsch's introduction of the concept of the
universal quantum computer in his 1985 paper that the field really begins (Deutsch, 1985).

Deutsch's paper is the seed from which the riches of quantum computation theory 
have grown, but in it are to be found roots of philosophical confusion over the notion of 
computation, in particular, in the claim that a physical principle, the Turing Principle, 
underlies the Church-Turing hypothesis.

The Turing Principle is stated as follows: 

Every finitely realizable physical system can be perfectly simulated by a
universal model computing machine operating by finite means. (Deutsch, 1985)

It is the claim that the Turing Principle underlies the Church-Turing hypothesis that
is primarily responsible for the thought that quantum computers are necessary to make
proper sense of Turing's theory. For the Turing Principle is not satisfied in classical
physics, owing to the continuity of states and dynamics in the classical case, yet it is,
Deutsch argues (Deutsch, 1985, §3), in the case of quantum mechanics. If the Turing
Principle really were the heart of the theory of computation, prior to the development of
the notion of quantum computers we would have been faced with a considerable difficulty,
as this supposedly fundamental Principle is false under classical mechanics. I shall be
arguing, however, that it is a mistake to see the Turing Principle as underlying the
Church-Turing hypothesis (Section 6.3), hence this issue does not arise. In Section 6.4
we will consider whether the Church-Turing hypothesis might play a role as a constraint
on physical laws, as suggested in the quantum case by Nielsen (1997), for example.

6.2 Quantum computation and containing information



Before moving on to discuss the Church-Turing hypothesis and the Turing Principle,
let us pause to consider briefly an argument suggesting that quantum systems should
be seen to contain information in a more literal, or substantive, sense than I have so
far allowed. This argument is based on the gains in efficiency over the best known
classical algorithms that can be achieved for certain important computational tasks
using quantum computers. The argument is suggested by the presentation of Jozsa (2000).

It is very natural (although not wholly uncontroversial) to view the property of
entanglement as the main source of the exponential speed-up given by quantum
algorithms such as that of Shor (Jozsa, 1998; Ekert and Jozsa, 1998; Jozsa, 2000;
Jozsa and Linden, 2003). This view can be motivated in the following way. If we
consider specifying the state of a system composed of n two-state classical systems, then n
bits are needed. By contrast, in order to specify a general state of an n qubit system, 
we will need to specify $2^n$ coefficients for the $2^n$ basis vectors of the system (because of
the tensor product structure of the state space); the order of the number of bits needed
will be exponential in n. It is often therefore said that '...a quantum system can embody
exponentially more information than its classical counterpart' (Jozsa, 2000, p. 108).

Now when we consider information processing, i.e., evolving the quantum state in a 
particular way, then even the simple case of a single 1-qubit operation (a single compu- 
tational step for a quantum computer) is equivalent to an exponentially large amount 
of classical computation, when the initial state is entangled. The effect of the unitary 
operation on the state would need to be calculated classically as a (2 × 2) matrix
multiplication for each of the $2^n$ coefficients specifying the state. The quantum evolution
corresponds to exponentially much classical computation, in the presence of entangle- 
ment: 

Natural quantum physical evolution may be thought of as the processing
of quantum information.... [T]o perform natural quantum physical evolution,
Nature must process vast amounts of information at a rate that cannot be
matched in real time by any classical means... (Jozsa, 2000, p. 109)
There is a strong suggestion that quantum evolution is doing a great deal of work, a
great deal of work in processing something, and therefore, there is something a great
deal of which is being processed: we should allow a more substantive notion of quantum 
information. 
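
The counting behind this suggestion can be made concrete with a small classical simulation. This is only an illustrative sketch, assuming NumPy; the function name and the choice of a ten-qubit entangled input are hypothetical. It shows that a classical description of an n-qubit state is an array of $2^n$ amplitudes, and that a single one-qubit gate has to be applied across that whole array.

```python
import numpy as np

def apply_one_qubit_gate(state, gate, target, n):
    """Apply a 2x2 gate to qubit `target` (0 = leftmost) of an n-qubit state vector."""
    psi = state.reshape([2] * n)                        # one axis per qubit
    psi = np.tensordot(gate, psi, axes=([1], [target])) # contract gate with that qubit
    psi = np.moveaxis(psi, 0, target)                   # restore the qubit ordering
    return psi.reshape(2 ** n)

n = 10
amplitudes = 2 ** n                   # numbers needed for a general n-qubit state
state = np.zeros(amplitudes, dtype=complex)
state[0] = state[-1] = 1 / np.sqrt(2) # entangled input: (|0...0> + |1...1>)/sqrt(2)

hadamard = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
out = apply_one_qubit_gate(state, hadamard, target=0, n=n)
```

Each additional qubit doubles the size of `state`, which is the sense in which one quantum step corresponds to exponentially much classical bookkeeping. Whether that correspondence licenses talk of the quantum system doing that much work is the question at issue in the text.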

This conclusion can be resisted by noting that we have here a further example of 
what I have termed the simulation fallacy (cf. Section 3.4.1). The fact that quantum
evolution corresponds to an exponentially large amount of classical computation implies 
that we can use quantum systems to do something that corresponds to a very great deal 
of work in classical terms. But we cannot infer from this that the quantum computer 
is doing this amount of work, rather than merely causing, in a different way, a result 
which could only be brought about with a lot of effort, classically. 



6.3 The Turing Principle versus the Church-Turing
Hypothesis

Let us now turn to consider the Turing Principle. In his landmark 1985 paper, Deutsch 
argues that underlying the Church-Turing hypothesis, the basis for the classical theory
of computation, there is an implicit physical assumption, namely, the Turing Principle, 
which is, recall: 

Every finitely realizable physical system can be perfectly simulated by a
universal model computing machine operating by finite means.^1 (Deutsch,
1985)

The Church-Turing hypothesis, by contrast, he states as follows:

Every 'function which would naturally be regarded as computable' can be
computed by the universal Turing machine. (Deutsch, 1985)

The two main ways in which these statements differ are, first, that Turing's 'functions
which would naturally be regarded as computable' has, in effect, been replaced by
'functions which may in principle be computed by a physical system' (Deutsch, 1985, p. 99),



^1 A computing machine M is said to perfectly simulate a physical system S, under a given labelling
of their inputs and outputs, if there exists a program π(S) for M that renders M computationally
equivalent to S under that labelling.
the result of the stipulation that the universal computing machine perfectly simulates
every finite physical system; and second, that the reference to a specific form of universal 
computer — the universal Turing machine — has been replaced by an unspecified universal 
computing machine, with the requirement only that it operate by finite means. 

The heuristic value of the move to the Turing Principle is undoubted, for it led 
Deutsch to define the universal quantum computer and hence spark a vigorous new 
field of physics. The psychological liberalisation involved in this move from the
Church-Turing hypothesis was thus invaluable, but, I shall suggest, it is mistaken to argue
that the Turing Principle underlies the Church-Turing hypothesis, or that this physical
principle should be thought of as the real basis for the theory of computation. 

To begin with, it is important to recognise that in his famous paper 'On Computable
Numbers', Turing (1936) was concerned with what is computable by humans, not with
describing the ultimate limits of what we now mean by 'computer'. Deutsch is well
aware of this fact (e.g. Deutsch et al., 1999, p. 2), but by glossing over it here, we would
miss several important things. First, the purely mathematical element of Turing's thesis;
second, the chance to separate out the precursors of the computational analogy from
the foundations of the theory of computation^2; and third, the distinction between the
task of characterizing the effectively calculable, which had become so urgent by the
mid 1930s and to which the Church-Turing hypothesis was directed, and the rather
different project of considering what classes of functions can be calculated by machines
or physical processes most widely construed (a distinction which Copeland, in particular,
has emphasized, e.g. Copeland (2000)). To see something of the significance of these
points, let us make the comparison with Church's position in his 1936 paper.

Church proposed that the intuitive notion of effective calculability be made precise
by identifying effectively calculable functions with the recursive functions (Church, 1936,
§7). Again, calculability here means calculable by humans. By contrast, Turing
presented the mathematical insight that if certain functions could be encoded in, for
example, binary terms, then a machine could be made to compute analogues of those
functions. The machine was the Turing machine and, it turned out, the functions the

^2 Shanker (1987) investigates this area and undertakes this separation in detail.
recursive functions. The second part of his argument, §9 of the paper, was then to
relate this to human calculation; an argument for why computability defined in terms 
of Turing machines should capture all that would 'naturally be regarded as computable' 
by humans. 



As Shanker, for example, recounts (Shanker, 1987, §2), the differences between
Church's and Turing's presentations were all-important for Gödel. Gödel did not
accept what is best seen as Church's stipulation that the effectively calculable functions
are the recursive functions until Turing's argument in 'On Computable Numbers' became
known. His objection was that Church had not shown why the properties associated
with our intuitive notion of effective calculability would be captured by the class of
recursive functions (see also Davis; Soare, 1996). That he came to accept Church's
convention after 'On Computable Numbers' shows that he took Turing to have solved
this problem. Presumably, what was important about this solution was not Turing's 
demonstration of the capabilities of the Turing machine, but rather, the argument in §9 
that Turing machine computability captures that which would 'naturally be regarded as 
computable'. Thus Gödel was convinced of the adequacy of Turing's account of what it
is for a human to calculate in a formal system; and that this was no different from the 
operation of a Turing machine. In this way Turing was supposed to have explicated the 
intuitive notion of effective calculability. 

However, we should note that it is precisely this step back to the notion of calculable- 
by-human from calculable-by-machine and attempting to explain the former in terms 
of the latter that gives rise to the computational analogy, which may well be seen as 
philosophically problematic^3; and note further that this final 'step back' is an entirely
logically separable part of the argument. The class of functions that may be calculated 
algorithmically by a human computer may be co-extensive with the class of Turing 
machine computable functions, without one having to explain human computation in 
mechanical terms. 

To continue: If we were to follow Deutsch and reinterpret Turing's 'functions which 
would naturally be regarded as computable' as the functions which may in principle 

^3 Shanker (1987) locates the ultimate source of the pressures that lead here to the computational
analogy (as he calls it, the Mechanist Thesis), with Hilbert.
be computed by a real physical system, then we are neglecting the fact that Turing
meant computable by humans. This is no mere historical point. The most obvious 
consequence would be that we ignore the possibility of making the useful distinction 
between computing by human and computing by machine — a physical system considered 
as a computer. But perhaps more importantly, we miss the significance of Turing's purely 
mathematical thesis, his recognition that certain functions can be encoded and machines 
thus made to compute them for us. Deutsch's argument for his reinterpretation is that 

...it would surely be hard to regard a function 'naturally' as computable if it
could not be computed in Nature, and conversely. (Deutsch, 1985, p. 99)

In the first part of this, 'computed in Nature' suffers from the suggestive ambiguity 
between computable by human and computable by physical object, so let us take it to 
mean computable by machine, or more widely, physical object considered as a computer. 
More important for the present is the converse, which would read: 

It would be hard to regard a function computable in Nature as not 'naturally' 
computable. 

But this is rather a teasing play on words. Part of the point at issue is what it means for 
a function to be computable in Nature, for a function to be computed by a machine, a 
meaning that Turing had to provide en route to determining what the relation between 
functions computable in Nature and the 'naturally computable' might be. If we just 
claim that the 'naturally computable' functions are all and only those functions that 
can be computed in physical reality, we not only, perforce, miss the original point of 
trying to capture the effectively calculable, but more importantly for present purposes, 
we miss out the key mathematical component at the heart of the theory of computation. 
For we have not provided, as Turing did, a specification of what it is for a physical object 
to compute, to give a mathematical meaning to the possible evolutions of physical states. 

What can be computed in physical reality has two sorts of determinant, mathematical 
and physical. The mathematical determines what the evolution of given physical states 
into others in a certain way would mean, what would have been computed by such a 
process; and the physical determines whether such a process can occur. Identifying the 
'naturally computable' functions with those that can be computed by physical systems,
we emphasize the physical determinant to the exclusion of the mathematical one — we 
say that what can be computed is whatever can be computed by any physical system, 
but we have not said what, if anything, these various physical processes amount to in 
mathematical terms. 

When Deutsch says that behind the Church-Turing hypothesis is really an assertion of
the Turing Principle (Deutsch, 1985, p. 99; Deutsch et al., 1999, p. 3), what he is trying
to capture is the imperious nature of the hypothesis: you can't find any computation that
can be done that can't be done by the universal Turing machine. He takes this imperious
claim to require the possible existence of a physical object that could actually perform
every (physical) computation. For '...the computing power of abstract machines has no
bearing on what is computable in reality' (Deutsch, 1997, p. 134); what is important
is whether the computational processes that the machine describes can actually occur.
The essence of the universal computing machine is supposed to be that the physical 
properties it possesses are the most general computational properties that any object 
can possess. It follows that if the universal machine is to be an interesting object of 
study, it must be physically possible for it to exist (although supplies of energy and 
memory may remain a little idealised), otherwise studying it could tell us nothing about 
what can be computed in reality. 

The significance of the Turing machine is thus supposed to lie in the fact that its 
description is so general that it has been pared down to the bare essentials of computing, 
with the result that any computation by any object can be described in terms of the 
operation of a Turing machine^4. Deutsch considers Turing's machine to be a very good,
but ultimately inadequate attempt to give a description of the most general computing
machine possible (Deutsch, 1997, p. 252). He would suggest that Turing had made himself
hostage to fortune by offering such a concrete characterisation of what is supposed 
to be the most general computing machine, in particular by explicitly describing the 
machine in classical (mechanical) terms and not allowing for the possible implications of 

^4 This is perhaps a common view of the significance of Turing's machine.
quantum mechanics or some other successor theory^5. Taking Turing's intention to refer
to the most general machine as the important thing and erasing the unnecessary physical
details of the Turing machine^6, the content of the Church-Turing hypothesis becomes
the assertion that this most general machine can exist. The hypothesis has become the 
physical Principle — it is now just an empirical question whether the universal computing 
machine can exist. 

But this misrepresents the import of the Church-Turing hypothesis, for we have
missed the mathematical component, the definitional role of the Turing machine in the 
theory of computation. Put baldly, the reason why there is no computation that cannot 
be performed by the universal computing machine is not that it just so happens that this 
object can actually exist in physical reality, but rather that nothing could count as such a 
computation. A computation is defined by reference to the abstract universal computing 
machine, the possible evolution of physical states given a mathematical meaning by 
reference to that model. What we call a computation is determined by the abstract 
model, hence there can be no such thing as a computation that cannot be performed by 
the universal computing machine. 

Of course, it is conceivable that there could be physical processes that are not covered 
by our abstract model and which we decide we might want to call computations, but 
these processes still need to gain a mathematical meaning from somewhere; and once 
we have given them such meaning, we will have extended our definition of computing 
to cover these cases as well.^7 This does not, however, affect the point that by definition
there can be no computation that cannot be performed by the universal computer, 
as a corollary of these physical processes having mathematical meaning. (Until these 
processes are accepted under the definition, they are not yet computations. Compare 
p. 164 for an example of a specific type.)

^5 Deutsch cites, for example, Feynman's remark apropos Turing: 'He thought that he understood
paper.' (Deutsch, 1997, p. 252)

^6 The essence of the Turing machine is retained in the requirement that the universal computing
machine operate by finite means, defined in Deutsch (1985, p. 100).

^7 Note that this question differs from the question of whether the definition of machines computing
captures all that would 'naturally be regarded as computable' by humans. What is currently at issue is 
the mathematical meaning that can be given to various physical processes, not whether the definition of 
computing offered would include all and only that which falls under the intuitive notion of the effectively 
calculable. 
Having noted that from two computing machines we can form a composite machine,
whose set of computable functions contains the union of the sets of functions computable 
by its components, Deutsch suggests that: 

There is no purely logical reason why we could not go on ad infinitum building
more powerful computing machines, nor why there should exist any function
that is outside the computable set of every physically possible machine.
(Deutsch, 1985, p. 98)

He goes on to suggest that it is physics rather than logic that provides the constraint 
(presumably the contingent physical fact that there can exist a universal computing 
machine exhausting the possibilities). But this seems wrong. Our immediate response 
is to ask why might there not simply come a point after which no new functions are 
added and we would just keep adding ones we already have? What might determine this? 
It is precisely logic, logic and mathematics, that determine this question. Once we have 
defined our computational states in a certain way, it is mathematics that determines the 
set of functions that can be computed by all possible evolutions of those states. Deutsch 
is correct that physics has a role to play in determining what is computable, but it can 
only get in on the act after mathematics. 

Another example of Deutsch seeming to over-emphasize the role of physics at the 
expense of mathematics is the following passage: 

Computers are physical objects, and computations are physical processes. 
What computers can or cannot compute is determined by the laws of physics
alone and not by pure mathematics. (Deutsch, 1997, p. 98)

Computations, remembering that we are speaking strictly of mechanical, not human 
computers, are indeed physical processes, but what makes them a computation is not 
physical. The processes going on in a computer are governed by the laws of physics, 
but it would be wrong to say that the computation is entirely governed by physics, 
for mathematics determines what the transitions from physical state to physical state 
mean. Physics determines what physical state can follow from what physical state, 
but mathematics determines whether or not this is a computation and what it is a 
computation of. 
Deutsch is quite right to emphasize that the physical determinants of computing
should not be ignored in the theory of computation, but he has taken this insight too 
far by entirely neglecting the mathematical determinants. We must recognise that their 
place is prior to that of the physical determinants. If our theory of computation is 
asking what the ultimate limits of computation are (again, computation by machines), 
the answer must involve two sorts of consideration. We are asking what is possible with 
physical computational states defined in a given way, so our first consideration is what 
can these states evolve to, whilst the second is: what does such an evolution mean? The 
first part is a physical question and the second part mathematical. Maths will determine 
what $a_i \ldots a_f$ means (the $a$ being physical states under some description), but it won't say
if it is a possible evolution of states — that is for the laws of physics to decide. 

We might want to say, then, that mathematics provides the ultimate bound on what 
is computable (most obviously, nothing could count as computing a contradiction); and 
it determines what progressions of physical states are computations and what they are 
computations of. But what progressions of physical states there can be is determined 
by physics. 



In their admirably clear 1999 paper, Deutsch, Ekert and Lupacchini (Deutsch et al.,
1999) admit that there are both logical and physical limits to the computations that can
be performed by computing machines. They present the halting problem as an example 
in which logical and physical constraints are intimately linked; but their discussion 
still seems to betray confusion between the mathematical and the physical nature of 
computing. 

From the halting problem, we learn that there are some computational problems, in 
particular, determining whether a specified universal Turing machine given a specified 
input will halt, that cannot be solved by any Turing machine; and it is logic that tells 
us this. Deutsch, Ekert and Lupacchini go on to say that: 



In physical terms, this statement says that machines with certain properties
cannot be physically built, and as such can be viewed as a statement about
physical reality or equivalently, about the laws of physics. (Deutsch et al.,
1999, p. 4)
But the halting problem tells us nothing of the sort. The halting problem lies primarily
on the mathematical side of computing and teaches us nothing directly about the laws of 
physics. Given the specification of computing states we are dealing with, mathematics 
and logic tell us that nothing could count as providing the solution to this problem; no 
possible state is the solution. Thus the 'certain properties' that the machines may not 
possess are mathematical properties, not physical ones. It is not that the machines are 
forbidden to possess these properties, that some force prevents it, it is that nothing would 
count as building a machine with these properties. The halting problem, then, tells us 
nothing about what can be built; it tells us the mathematical constraints on what can 
be computed given the way we have defined computing. Failing to recognise this means 
failure to understand the way in which the definitional role of the abstract universal 
computer gives mathematical meaning to the evolution of physical states. This in turn 
can be traced back to a failure to recognise Turing's purely mathematical achievement in 
'On Computable Numbers', quite separate from the concern there with epistemological 
issues surrounding effective calculability.^8

We have seen that Deutsch's emphasis on the possible physical existence of the 
universal computing machine misrepresents its significance; missing entirely its essential 
role determining the mathematical meaning of the evolution of physical states. From 
this it is clear that insisting on the physical nature of the Turing Principle debars it 
from playing the central role in the theory of computation. For it is not a contingent, 
empirical fact that there exists a universal computing machine, it is a necessary fact that 
arises from the way the abstract model determines the mathematical meaning of certain 
physical processes, making them computations. It is not that the universal machine 
covers all the possibilities, the universal machine determines the possibilities. 

Where Deutsch is correct, however, is that there is a clear sense in which we should be
interested in the physical realization of the abstract computing machine. The importance 
of being able to build the machine, if only in principle, is that we want the progressions of 
states it describes to actually be do-able! This would clearly determine whether we have 
an interesting definition of computation and one worth pursuing. (We should emphasize 

^8 I am indebted to Shanker (1987) for the emphasis on this separation.
here, just to be clear, that this issue is distinct from the issue of the definition of
computation providing mathematical meaning for physical processes and distinct again 
from the task of characterizing the effectively calculable.) However, we do not require 
that the universal computing machine be a possible physical existent; all that is required 
is that physical analogues of the computational processes of the universal machine are 
physically possible processes. 



6.3.1 Non-Turing computability? The example of Malament-Hogarth
spacetimes

As a particularly striking example of where these concerns would be relevant, let us
consider Hogarth's presentation of non-Turing computability in certain relativistic
spacetimes (Hogarth, 1994). The idea is that in these spacetimes, dubbed Malament-Hogarth
spacetimes, it appears possible to perform supertasks: an infinite number of steps in a
finite length of time. These spacetimes (M, g) are such that they contain a path A that
starts from a point p and has infinite length, but that on this path it is always possible
to signal to a point q that can be reached from p in a finite span of proper time.^9



A toy example of such a spacetime is given by Earman and Norton (1993). Starting
with a Minkowski spacetime $(\mathbb{R}^4, \eta)$ we choose a scalar field $\Omega$ on M such that $\Omega = 1$
outside a compact set $C \subset M$ and $\Omega$ tends rapidly to infinity as we approach a point
$r \in C$. The spacetime $(\mathbb{R}^4 - r, \Omega^2 \eta)$ is then a Malament-Hogarth spacetime and the
path A will start at p and go towards r. What we are supposed to do is project a
given Turing machine down the path A and then travel to q, by which time the machine 
will have signalled to us if it has halted. Using this technique, we might, for example, 
solve the Goldbach conjecture by programming our Turing machine to check each even 
number in turn to determine whether it is the sum of two primes, and halt if it finds 
a counterexample. We then send it off down A and travel to q. If we have received a 
signal, the conjecture is false, if not, it is true. Generalising this approach, we appear 
able to solve Turing unsolvable problems in these spacetimes. 
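
The machine described for the Goldbach case can be sketched as follows. This is an illustrative rendering only: the function names are hypothetical, and the `max_even` bound is an assumption added so that the sketch terminates when run. The machine sent down the path A would run with no bound at all.

```python
def is_prime(k):
    """Trial-division primality test."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def is_sum_of_two_primes(even):
    """True if the even number can be written as p + q with p, q prime."""
    return any(is_prime(p) and is_prime(even - p) for p in range(2, even // 2 + 1))

def goldbach_machine(max_even=None):
    """Check each even number in turn and halt with a counterexample if one is found.

    With max_even=None the loop never ends unless a counterexample exists,
    which is the behaviour of the machine in the text. The bound is an
    assumption introduced here purely to make the sketch runnable.
    """
    even = 4
    while max_even is None or even <= max_even:
        if not is_sum_of_two_primes(even):
            return even      # the 'signal': the conjecture is false
        even += 2
    return None              # bound reached, no signal sent

signal = goldbach_machine(max_even=2000)
```

In an ordinary spacetime the absence of a signal after any finite run tells us nothing about the conjecture. The Malament-Hogarth scenario is meant to supply the missing step, since the observer at q has access to the outcome of the entire unbounded run.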

^9 That is, all points on A are contained in the chronological past of q. The chronological past of a
point q is the set of all points p for which there is a nontrivial future directed timelike curve from p to
q (Earman and Norton, 1993, p. 24, n. 1).
The decision problem for a property P is said to be solvable if there is a mechanical
test (effective procedure) which will tell us whether or not any object (of the appropriate
category) possesses P in a finite number of steps (Boolos and Jeffrey, 1974, p. 115). Thus
the decision problem for P is Turing solvable if there is both a Turing machine that will 
halt after a finite number of steps if and only if P holds and a Turing machine that will 
halt after a finite number of steps if and only if P does not hold. If only one of these 
exists, the problem is partially Turing solvable. The halting problem and the decision
problem for first order logic are partially Turing solvable, but the full decision problem 
can be solved for them in a Malament-Hogarth spacetime. For the halting problem, all 
we need do is project the Turing machine in question down A, set to signal if it halts. 
We travel to q and if we have received a signal, we know the machine halts and if not, 
we know it never halts. Similarly for the decision problem for first order logic, noting 
that there exists a Turing machine that will halt after a finite number of steps if a given
sentence S is valid (Boolos and Jeffrey, 1974, p. 145), we adopt the same procedure: if
we have received a signal at q, the sentence is valid; if we have not, it is not. It is
clear that the decision problem for any partially Turing solvable problem is solvable in a 
Malament-Hogarth spacetime (we will have to vary our interpretation of signal/no-signal 
appropriately, of course). 
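
The asymmetry behind partial Turing solvability can be illustrated with a toy sketch. As an assumption made purely for illustration, a 'machine' is modelled here by a Python generator that yields once per computational step and returns when it halts; `collatz_machine`, `looper` and `halts_within` are hypothetical names. A finite observer obtains a definite answer only in the positive case, and otherwise learns nothing however large the step budget.

```python
from typing import Iterator, Optional


def collatz_machine(n: int) -> Iterator[int]:
    """Toy machine: one yield per step, halts when the value reaches 1."""
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n


def looper() -> Iterator[int]:
    """Toy machine that never halts."""
    while True:
        yield 0


def halts_within(machine: Iterator[int], budget: int) -> Optional[bool]:
    """Run the machine for at most `budget` steps.

    Returns True if it halted within the budget. Returns None (not False)
    otherwise: a finite observer cannot tell 'not yet halted' apart from
    'never halts'.
    """
    for _ in range(budget):
        try:
            next(machine)
        except StopIteration:
            return True
    return None
```

In the scenario described above, the observer at q is in the position of being able to read the absence of a signal as a genuine negative answer, which is exactly what the `None` case here cannot provide.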

Hogarth goes on to describe more complicated computational processes that would 
seem to solve the decision problem for arithmetic, but the simple case serves for our 
purposes. We have here a clear example of the question of the physical realizability 
of the processes described being all-important. If the processes Hogarth describes are
physically possible, then we have a whole new class of computability distinct from Tur- 
ing computability and we extend our notion of computability accordingly. Note that the 
mathematical meaning of the processes Hogarth describes piggy-backs on our current 
definition of computability — we think we can see clearly what these processes would 
mean if they were physically possible. Given the meaning we have already given to com- 
putational processes in terms of the universal Turing machine and what it can compute, 
these meanings seem to follow.^" The reason why the claim that it is a conceptual truth 

^"I say 'think' and 'seem' here, for we may believe that these mathematical meanings unfold from,
since they are already contained in, the mathematical concepts we have. But we may believe that the 
that our particular universal computing machine can perform all possible computations
is not undermined by the Hogarth example and others like it is that we have, as it 
were, recognised new possibilities in our (abstract) universal computing machine, not 
discovered that it could not in fact perform all possible computations, which would be 
logically impossible. Or rather, to be more precise, by generalising or slightly adjusting 
the sets of physical states and their evolution for our definitional universal machine (in 
the Hogarth case, by including evolutions in these unusual spacetimes), we change the 
class of computations and computable functions at the same time. 

Returning to the question of the physical realizability of these Hogarthian processes, 
we need to recognise that the computational process extends from the initial launch 
of the Turing machine to the possible reception of the signal by the receiver. Thus 
whether these are physically possible computations will depend on whether a suitable 
Turing machine can exist in the spacetime in question (in particular we will be worried 
about what happens to it as it approaches r) , whether a signal from the Turing machine 
can reach the observer intact, and of course, whether Malament-Hogarth spacetimes are 
physically possible. If it turns out that these processes are physically possible, then we 
must extend our notion of what can be computed to include these striking non-Turing
computations. If they are not, then a definition of computability that included Hogarth's
computations would not be an interesting one for practical purposes; it would be
no more than a mathematical toy. We cannot learn any maths from the conceivability of
peculiar computational processes, for our knowledge of the relevant maths is already
explicit in our conceiving them; that it might be an open question whether these processes
are physically possible is only relevant to the question of what we can make machines
(or physical objects in general, since 'machine' implies manufacturing) do for us.

mathematical meaning of these processes ultimately rests on our decision to accept the conclusions set
out as following from our present stock of mathematical propositions. This allows for the positions of
those who believe there is a fact about, for example, whether Goldbach's conjecture is true independent
of whether a proof or disproof has been or ever will be found; and those who believe there is no such
fact until a proof or disproof has been found.

^^These questions should be approached with an open mind; see Earman and Norton (1993) for an
interesting discussion, and compare Hogarth (1994, §6).
6.3.2 Lessons



We have seen, then, that Deutsch has over-emphasized the physical determinants of
computing to the exclusion of the mathematical. The Turing Principle should not be
seen to underlie the Church-Turing hypothesis, for that misrepresents the mathematical
significance of the concept of the universal computing machine. The universal machine
defines the mathematical meaning of the possible evolution of physical states and hence
it is a necessary fact that the universal computing machine can perform every possible
computation. It is certainly interesting that the Turing Principle happens to be true
in quantum mechanics^^, but we should hesitate to draw any far-reaching conclusions
from this. Certainly, the claim adumbrated in Section 6.1 that the advent of the quantum
computer makes sense of Turing's theory of computation, that his machines were
quantum mechanical after all, is false.

The discussion might be summarized in the following way. 

It is useful to distinguish between three different tasks with which the Church-Turing
hypothesis is associated: characterizing the effectively calculable, providing the evolution
of physical states with mathematical meaning and fixing upon a useful definition of
physical computability. The Turing Principle could not replace or underlie the Church-Turing
hypothesis for any of these tasks. Not the first, because the Turing Principle
is supposed to concern all functions computable by physical systems, rather than what
is computable by a human; and not the second or third because an empirical principle
cannot play the crucial definitional mathematical role that I have emphasized. It is
perhaps worth noting that the Turing Principle is undoubtedly most closely tied in
intention to the third of these tasks rather than to the first. However, although it
is true that Turing did not consider the possibility of computations using explicitly
quantum objects, this can hardly be said to be to the detriment of the Church-Turing
hypothesis. The third of the tasks I have mentioned, delimiting the bounds of physical

^^Intuitively, the state of any finite quantum system is just a vector in Hilbert space and can be 
represented to arbitrary precision by a finite number of qubits; and any evolution of the system is 
just a unitary transformation of this vector and can be simulated by the universal quantum computer, 
which by definition can generate any unitary transformation with arbitrary precision. Deutsch offers a 
more rigorous proof taking into account the fact that any sub-system must always be coupled to the
environment (Deutsch, 1985, §3).
computability, is not really, after all, the object of the Church-Turing hypothesis.

As has been emphasized at various points, we have been talking in this section only 
of computation by machine or by physical object considered as a computer, as opposed 
to human computing or calculating. This is an important clarifying step that allows 
us to distinguish clearly the mathematical and physical sides of the theory of compu- 
tation. Having mentioned this convenient separation of human from machine, however, 
one's thoughts seem naturally drawn to the further, notoriously vexed, question of the 
relation between human cognition and machine computation. Rather than delve into 
this question here^'^, it suffices to note that even if it is thought that human calcula- 
tion is no more than physical calculation with a cherry on top, this separation remains 
important, for it emphasizes the different types of role the mathematical and physical 
determinants of computation play; and this distinction in role is one which, I suggest, 
should be retained independently of any judgement on the value of the computational 
analogy. 



6.4 The Church-Turing Hypothesis as a constraint on 
physics? 

In the preceding section we saw the necessity of distinguishing between a number of
different ideas with which the Church-Turing hypothesis is often loosely associated;
and it was emphasized in several places that the task of characterizing the effectively
calculable functions should be distinguished from the task of delimiting the bounds
of the physically computable, while it is the former task to which the Church-Turing
hypothesis is directed. This important point has been ably expounded by Copeland
(2000, 2002) (see also Pitowsky (2002)).^''



On this topic, a telling observation concerns the nature of the evidence that is cited 



^^See Timpson (2004, §4) for discussion of this question. One point that it is perhaps helpful to note
is that the debate about the nature of human cognition and of thinking machines might generate less 
heat and confusion if the question of whether it might be possible to build a machine which we could 
appropriately ascribe mental conduct terms to were always clearly distinguished from the question of 
whether it is possible to analyse cognition and conation in computational terms. 

^''For an intemperate reply to Copeland, in defence of the 'orthodoxy' which conflates these and other
ideas, see Hodges (2003). Copeland and Proudfoot (2004, §5) reply.
as endowing the Church-Turing hypothesis with the very high degree of entrenchment
that it deservedly enjoys. This evidence generally centres on the fact that the large
number of different attempts to make precise the intuitive notion of effective calculability
all give rise to the very same class of computable functions, along with the fact
that all the functions we intuitively take to be effectively calculable fall into this class.

A representative textbook statement (Cutland, 1980) is the following (N.B. the basic
computational model in this book, equivalent to the universal Turing machine, is the
unlimited register machine (URM)):

The evidence for Church's thesis, which we summarise below, is impressive. 

1. The Fundamental result: many independent proposals for a precise for- 
mulation of the intuitive idea led to the same class of functions, which 
we have called C. 

2. A vast collection of effectively computable functions has been shown
explicitly to belong to C [...]

3. The implementation of a program P on the URM to compute a function 
is clearly an example of an algorithm; thus... we see that all the functions 
in C are computable in the informal sense. Similarly with all the other 
equivalent classes, the very definitions are such as to demonstrate that 
the functions involved are effectively computable. 

4. No one has ever found a function that would be accepted as computable
in the informal sense, that does not belong to C.

(Cutland, 1980, p. 67)

The point is that all this evidence, while certainly telling us something important, has 
no implications at all for the question of what the bounds of physical computability 
are — on the question of what we can get physical systems to do for us. It simply 
points to the fact that Church, Turing and others did indeed succeed (amazingly well) 
in making precise the intuitive notion of effective calculability. And note that the facts 
cited are not really evidence for a hypothesis, but rather emphasize that the Church- 
Turing definition, or stipulation, does not lead to conflict with any pre-theoretic notions 
of effective calculability. These facts are not evidence, then, but are reasons why this 
definition is both a very good and a remarkably powerful one. 

The unimpeachable status that the Church-Turing hypothesis enjoys does not, therefore,
impugn (nor could it be impugned by) the possibility of physical computational
models that go beyond Turing computability (Malament-Hogarth computability
gave us a concrete example); the areas of concern are quite distinct. It follows
that one shouldn't seek to use the Church-Turing hypothesis as a restricting principle
on physical laws.

However, a prominent example of this unfortunate mode of reasoning in the context
of quantum mechanics is to be found in Nielsen (1997). The abstract to this paper states:

We construct quantum mechanical observables and unitary operators which,
if implemented in physical systems as measurements and dynamical evolutions,
would contradict the Church-Turing thesis which lies at the heart of
computer science. We conclude that either the Church-Turing thesis needs
revision, or that only restricted classes of observables may be realized, in
principle, as measurements, and that only a restricted class of unitary operators
may be realized, in principle, as dynamics. (Nielsen, 1997)

To give a flavour of the approach: the author begins by considering an observable
defined by

\[ h = \sum_{x=0}^{\infty} h(x) \, |x\rangle\langle x| , \]

where $\{|x\rangle\}$ is an orthonormal basis for some physical system with a countably infinite
dimensional Hilbert space (e.g. the number states of a particular mode of the e-m field),
and $h(x)$ is the characteristic function for the halting problem. We may suppose that
the various $|x\rangle$ states can reliably be prepared. Measurement of this observable on
systems prepared in these states will then evaluate the halting function for us. Nielsen
concludes that this would conflict with the Church-Turing hypothesis; therefore, we must
either revise the hypothesis, or conclude that this type of measurement is not in fact
physically possible. Given the entrenchment of the Church-Turing hypothesis, Nielsen
opts for the latter. But the conclusion is misplaced, as the Church-Turing hypothesis
does not rule out the possibility of non-Turing computability using physical systems;
and the entrenchment of the hypothesis does not rest on empirical evidence about what
can be computed by physical systems^^.
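
A finite toy version of such a diagonal observable may help fix ideas. The sketch below is an illustration under loud assumptions, not Nielsen's construction: the Hilbert space is truncated to `DIM` dimensions, and the halting function, which no program can compute, is replaced by an arbitrary computable stand-in `stand_in_h`. All names are hypothetical. The Born rule then gives the outcome statistics for a measurement of the observable.

```python
from typing import Dict, List, Sequence

DIM = 8  # assumed truncation of the countably infinite basis


def stand_in_h(x: int) -> int:
    """Computable placeholder for h(x); NOT the halting function."""
    return 1 if x % 3 == 0 else 0


# Eigenvalue attached to each basis state |x>: the spectrum has to be
# supplied when the observable (the 'apparatus') is specified.
TABLE: List[int] = [stand_in_h(x) for x in range(DIM)]


def basis_state(x: int, dim: int) -> List[complex]:
    """Amplitudes of the basis state |x> in a dim-dimensional space."""
    return [1.0 if i == x else 0.0 for i in range(dim)]


def outcome_distribution(
    eigenvalues: Sequence[int], amplitudes: Sequence[complex]
) -> Dict[int, float]:
    """Born-rule probabilities for measuring sum_x h(x)|x><x| on a state."""
    norm = sum(abs(a) ** 2 for a in amplitudes)
    dist: Dict[int, float] = {}
    for ev, a in zip(eigenvalues, amplitudes):
        dist[ev] = dist.get(ev, 0.0) + abs(a) ** 2 / norm
    return dist
```

Measuring on a prepared basis state returns the corresponding eigenvalue with certainty, so the measurement simply reads off whatever table was built into the observable in the first place.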

^^In fact, one can raise a further problem for this example of Nielsen's: it is not clear that it would
constitute an example of non-Turing computability. In order to perform the measurement corresponding
to the operator h, we need to be able to pick out the correct piece of equipment in the lab. But in




It is sometimes suggested that part of the meaning of the slogan 'Information is
Physical' for the quantum information scientist is to encapsulate the recognition of the
need to go beyond the Church-Turing hypothesis in the theory of computation. Our
reflections in this chapter give the lie to this conception, however. It is based on an
equivocation between the task of characterizing the effectively calculable functions (the
task of the Church-Turing hypothesis) and the distinct task of investigating the bounds
of physical computability.



order to do this one would already have to have evaluated the halting function (imagine a shelf in the
lab with a series of apparatuses all of which measure in the $\{|x\rangle\}$ basis, but have different eigenvalue
spectra associated with them). Thus the outlined procedure would not count as an effective procedure,
as one can't pick out the desired piece of apparatus by an effective procedure. In essence, the solution
to the halting function has been hardwired into the apparatus, but we can't get at it unless we already
have the solution.



Chapter 7 



Morals 

It is time to draw together some morals from the preceding chapters. 

The first is the simple statement that the everyday and the information-theoretic 
notions of information are indeed distinct and are to be kept apart. When this is done, 
I believe, we have a significant counter-weight to the rather breathless sorts of claims 
remarked on in the introduction, to the effect that the advent of quantum information 
theory heralds a revolution for our world picture in which the material will be replaced 
as fundamental by the immaterial: information. 

When we recognise that quantum information theory is not concerned with, and has 
no implications for, information in the everyday sense of the word, that is, for a notion 
of information involved with semantic and epistemic concepts, it can be seen that these 
claims for the implications of quantum information theory, which are concerned with 
our epistemological position and the meaning of our discourse, seem groundless. 

The further point that it is important to keep the distinction between the technical 
and everyday notions of information clear when approaching foundational questions in 
quantum mechanics will be remarked on again, briefly, in the next chapter. 

The second moral is a very general one regarding the nature of quantum information 
theory. It has been maintained throughout Part I that 'information' is an abstract noun. 
We saw, in particular, the value of being clear on this fact in the quantum case when 
considering the topic of teleportation. The conceptual troubles this process had seemed 
to present were revealed to be the result of the old philosophical error of hypostatizing an 
abstract noun. But where, we might ask, does this leave quantum information theory? 

Without a subject matter? Indeed not.

The correct conclusion, it seems to me, is this: Quantum information theory is best 
seen not as a theory of some strange new thing — quantum information — but rather as a 
new type of information theory. That is, a theory of communication and computation 
which identifies new types of physical resource — qubits, shared entanglement, etc. — and 
correspondingly identifies new sorts of task that can be achieved using them, along with 
some different (and hopefully, in at least some cases, better) ways of managing old tasks. 
The subject matter of the theory, if it is demanded that such be found, can simply be 
said to be these new types of physical resources and the tasks that can be achieved using 
them. 

One might say that it is a question of bracketing. Quantum information theory is 
not a 

(quantum information) theory 
but a 

quantum (information theory). 

Our final moral concerns the slogan 'Information is Physical'. It might be recalled 
that having discussed the dilemma this proposition faces, I rather left the conclusion 
that should be drawn hanging. Let me make that good now. Recall that the dilemma 
was produced by the fact that if 'information' in the slogan was supposed to advert to 
information in the everyday sense of the word, then the claim would seem to involve a 
commitment to the project of semantic naturalism. But it was argued that the success 
and practice of quantum information theory has no implications for this purely philo- 
sophical question; and no more would the outcome of the philosophical debate have 
any implications for quantum information theory. If, though, the slogan is supposed 
to refer to information in the technical sense of Shannon theory, then it is hard to see 
how the claim that some physically defined quantity is physical is in the least enlighten- 
ing. (Alternatively, I suggested that the whole thing could simply be seen as a category 
mistake.) 

The resolution of the dilemma is this: the claim 'Information is Physical' as made 
by a quantum information scientist should not, I suggest, be construed as an ontological
claim, but rather as a methodological one. It does not represent a claim about how the
world is, or represent an insight into the nature of information, but it should be seen,
rather, to express a commitment characteristic of the discipline: roughly speaking,
the view that it is a very interesting and fruitful business to study the information
carrying, storing and processing capabilities of physical systems as described by our
most fundamental physical theories. A priori this need not have been so, but the
vibrant health of quantum information science assures us, emphatically, that in the case
of quantum systems, it most certainly is.



Part II 

Information and the 
Foundations of Quantum 
Mechanics 
Information theory has, in the last few years, become something of a scientific
bandwagon... 

Although this wave of popularity is certainly pleasant and exciting for those 
of us working in the field, it carries at the same time an element of danger. 
While we feel that information theory is indeed a valuable tool... it is certainly 
no panacea for the communication engineer or, a fortiori, for anyone else. 
Seldom do more than a few of nature's secrets give way at one time. It will be 
all too easy for our somewhat artificial prosperity to collapse overnight when
it is realised that the use of a few exciting words like information, entropy,
redundancy, do not solve all our problems.
(Shannon, 1956)



Chapter 8 

Preliminaries 



8.1 Information Talk in Quantum Mechanics 

Shannon's words above represent a salutary warning for those of us interested in the
question of whether quantum information theory has implications for the foundational
problems of quantum mechanics. Is it, perhaps, that we have become overly excited by 
the appearance of a few trigger words (information, uncertainty, entropy...) in books, 
journals and pre-print servers dedicated to quantum theory? Compare on the other 
hand, Fuchs: 

...no tool appears better calibrated for a direct assault [on quantum foun- 
dations] than quantum information theory. Far from a strained application 
of the latest fad to a time-honored problem, this method holds promise pre- 
cisely because a large part — but not all — of the structure of quantum theory 
has always concerned information. It is just that the physics community
needs reminding. (Fuchs, 2002a)

In this brief chapter I shall set out a few preliminaries; some points that are rather 
basic, but essential when trying to see what can be made of information talk in quantum 
mechanics. 

Appeal in some form to the notion of information as a way of addressing the conceptual
problems presented by quantum mechanics has been a recurrent feature of many
discussions of the quantum foundations, particularly for those in the Copenhagen tradition;
and this trend has been reinvigorated following the growth of quantum information
theory. For a selection of more recent statements see, for example,

Very often, the suggestion proceeds along the lines that the traditional problems of
measurement, nonlocality and so on, are resolved when one recognises that the quantum 
state should simply be viewed as representing one's knowledge or information rather than 
any objective property of the world. A representative formulation is the following, due 
to Hartle: 



The state is not an objective property of an individual system but is that 
information, obtained from a knowledge of how a system was prepared, which 
can be used for making predictions about future measurements. 

...A quantum mechanical state being a summary of the observers' information 
about an individual physical system changes both by dynamical laws, and 
whenever the observer acquires new information about the system through 
the process of measurement. The existence of two laws for the evolution 
of the state vector... becomes problematical only if it is believed that the 
state vector is an objective property of the system... The "reduction of the 
wavepacket" does take place in the consciousness of the observer, not because 
of any unique physical process which takes place there, but only because the 
state is a construct of the observer and not an objective property of the
physical system. (Hartle, 1968, p. 709)^



As so often in the foundations of quantum mechanics, however, it is instructive to 
turn to the writings of John Bell; and there we find a warning. For 'information' is on 
Bell's famous list of bad words that 'have no place in a formulation with any pretence
to physical precision' (Bell, 1990, p. 34)^. Bell indicates the pertinent sources of disquiet
with two rhetorical questions: Information about What?; Whose information? 

These are indeed good questions, and the first most especially. For it presents a 
fundamental dilemma: the Scylla and Charybdis facing proponents of information talk 
in quantum mechanics. 

^It may be noted in passing that Hartle's argument for these propositions is by no means entirely 
persuasive. While there is not room to go into details here, suffice it to say that his argument for 
construing the state of a system as information trades on an ambiguity between specifying what a 
state is (e.g., an assignment of truth values to experimental propositions) and specifying what state 
something is in; moreover, a realist opponent can always insist that the quantum state only allows 
us to make predictions about the behaviour of a system precisely because it corresponds to a system's
possessing certain objective properties. 

^To illustrate Bell's use of the term 'formulation': "Surely, after 62 years, we should have an exact 
formulation of some serious part of quantum mechanics? By 'exact' I do not of course mean 'exactly 
true'. I mean only that the theory should be formulated in mathematical terms, with nothing left to 
the discretion of the theoretical physicist... until workable approximations are needed in applications."
(Bell, 1990, p. 33).




If the quantum state is to be construed in terms of representing one's information 
then it seems that there are two possible sorts of answer that could be given to the 
question 'Information about what?': 

1. Information about what the outcomes of experiments will be; 

2. Information about how things are with a system prior to measurement, i.e., about 
hidden variables. 

Now the latter option is unlikely to be attractive to anyone who is trying to appeal to 
information as a way of avoiding the problems caused by the seemingly odd behaviour 
of the quantum state. The aim, roughly speaking, was to circumvent the problems 
associated with collapse or nonlocality by arguments of the form: there's not really any 
physical collapse, just a change in our knowledge; there's not really any nonlocality, 
it's only Alice's knowledge of (information about) Bob's system that changes when she 
performs a measurement on her half of an EPR pair. But we all know that if we are 
to have hidden variables lurking around then these are going to be very badly behaved 
indeed in quantum mechanics (nonlocality, contextuality) . So it surely can't be this 
second answer that our would-be informationist is really after. 

But now consider the first answer. If the information that the state represents is in- 
formation about what the results of experiments will be, then the difficulty is now to say 
anything interesting that doesn't simply slide into instrumentalism. Instrumentalism, of 
course, is the general view that scientific theories do not seek to describe the laws gov- 
erning unobservable things, but merely function as devices for predicting the outcomes 
of experiments. An instrumentalist view of the quantum state understands the state 
merely as a device for calculating statistics for measurement outcomes: this is very close 
to the view that the state merely represents information about what the results of measurements
will be. But if all that appeal to information were ultimately to amount to is
a form of instrumentalism, then we would not have a particularly interesting, and certainly
not a novel, interpretational doctrine. It should be noted that merely presenting
an old doctrine such as instrumentalism in the currently popular idiom of information
does not make it any more (or, admittedly, any less) of an attractive doctrine. (Here
Shannon's warning is very pertinent.)

Thus the dilemma. To present a distinctive and hence an interesting doctrine, it
seems that the proponent of information has somehow to steer a course that avoids
hidden variables, yet does not merely amount to instrumentalism; but it is not clear
that this is easily done.

One option might be the following: one could emphasise that in contrast to standard
instrumentalism, the focus of one's interest is individual systems rather than the statistics
of measurements for ensembles. But this approach suffers from a decisive objection.

For there is a very important further subtlety that needs to be highlighted if one 
is interested in viewing the quantum state as representing an agent's knowledge or 
information. This is the point that — to use the philosophical jargon — both the terms 
'knowledge' and 'information' are factive. That is, one can't know that p unless p is 
the case; one can't have the information that p unless it is true that p. The major 
difficulty this presents for those who may have hoped to avoid the conceptual problems 
of quantum mechanics by understanding the quantum state of an individual system in 
terms of information is that the factivity of knowledge and information entails precisely 
the sort of objectivity that the invocation of information was originally intended to 
bypass. 

The straightforward instrumentalist seeks to avoid the problems associated with mea- 
surement and nonlocality by remaining at the level of statistics only: individual systems 
are not described and collapse doesn't correspond to any real process. So far as it goes, 
this strategy is reasonably successful"^. For someone taking the information route and 
associating a quantum state with individual systems, however, the essence of their approach
is that different agents can ascribe different states to a given quantum system,
because they have different information regarding it. Thus in the Wigner's friend scenario
(Wigner, 1961), for example, rather than there having to be a mysterious collapse
at some point, both agents involved simply ascribe different states to the system being
measured. There is not supposed to be one correct state which is in some sense an objective
property of a system; rather, each agent will ascribe a different state based on their
^See Saunders for some criticisms of instrumentalism as a solution to the problem of measurement,
though.
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differing information (whether they are inside the lab doing the measurement, or waiting 
patiently outside for their friend). Similarly in the EPR case, Alice's measurement is 
understood not to change any real properties of Bob's system; her measurement merely 
provides her with some particular information about it, in virtue of the correlations 
involved in the inital entanglement. Post measurement, she will ascribe a new state to 
Bob's system — which is located at a distance — but since the state does not correspond 
to an objective property of the system, this does not connote nonlocality. Indeed, Bob 
continues, all the while, to ascribe the same old state (density operator) to his system 
as ever, until he performs a measurement of his own, or gets in touch with Alice. 
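
The state ascriptions described in the EPR case can be made concrete with a small numerical sketch. It is an illustration only and is not drawn from the works under discussion: the function names are hypothetical, the pair is assumed to be in the singlet state, and Alice is assumed to measure spin along z. The sketch computes the density operator Bob ascribes to his system, the pure state Alice ascribes to it once she knows her outcome, and the average of Alice's possible ascriptions over her outcomes.

```python
from math import sqrt

# Basis order for the pair: |00>, |01>, |10>, |11>, Alice's qubit first,
# with 0 = spin up and 1 = spin down along z.
SINGLET = [0.0, 1 / sqrt(2), -1 / sqrt(2), 0.0]


def bob_reduced_state(psi):
    """Bob's density operator: trace out Alice's qubit."""
    rho = [[0.0, 0.0], [0.0, 0.0]]
    for a in range(2):
        for b in range(2):
            for b2 in range(2):
                rho[b][b2] += psi[2 * a + b] * psi[2 * a + b2].conjugate()
    return rho


def alice_ascription(psi, outcome):
    """State Alice ascribes to Bob's qubit after finding `outcome` along z.

    Returns (probability of the outcome, Bob's density operator given it).
    """
    branch = [psi[2 * outcome + b] for b in range(2)]
    prob = sum(abs(x) ** 2 for x in branch)
    vec = [x / sqrt(prob) for x in branch]
    return prob, [[x * y.conjugate() for y in vec] for x in vec]


# What Bob ascribes throughout, absent any news from Alice.
bob_view = bob_reduced_state(SINGLET)

# Alice's two possible new ascriptions, averaged over her outcomes.
alice_average = [[0.0, 0.0], [0.0, 0.0]]
for outcome in (0, 1):
    prob, rho = alice_ascription(SINGLET, outcome)
    for i in range(2):
        for j in range(2):
            alice_average[i][j] += prob * rho[i][j]

print("Bob's ascription:", bob_view)
print("Average of Alice's ascriptions:", alice_average)
```

Bob's density operator coincides with the outcome-average of Alice's pure-state ascriptions, which is why nothing Bob can measure locally reveals that Alice has acted. The sketch shows only the formal bookkeeping; whether such ascriptions can coherently be read as differing information is the question taken up in the text.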

But the factivity of information and knowledge puts paid to these forms of argu- 
ment: if the quantum state represents what one knows, or what information one has, 
then things have to be as they are known to be. For example, if I know what the probability
distributions for the outcomes of various measurements on a system are, then the
probabilities must indeed be thus and so. We have a matter of right or wrong determined 
by what the properties of a system actually are. If Alice performs a measurement on her 
half of an entangled pair in the singlet state and subsequently knows the pure state of 
Bob's system, then his system objectively has to be in that state. Alice now knows that 
a particular experiment will have some outcome as a certainty, whereas before it didn't; 
and this is a determinate matter of fact. Thus we end up, in this approach, having to 
talk again about objective properties of a system, and objective properties that can be 
changed at a distance, even after making our appeal to knowledge and information talk. 
No progress is thus made with the conceptual problems in this direction; the approach 
is a blind alley. 

I have so far emphasised only one of Bell's questions. The point of the second, 'Whose 
information?' is presumably to highlight what Bell felt would be an unacceptable level
of vagueness associated with use of the term 'information', if it were to occur in a pu- 
tative formulation of fundamental theory. This vagueness could be seen to come from 
two different directions: first, a vagueness of anthropocentrism (how are we to specify 
with any precision what counts as a bona fide cognitive agent?); second a vagueness 
associated with subjectivity (different agents might occupy different perspectives, perhaps),
although this sort of worry is to some degree mollified by realising the factivity of
information.^

The dangers of amounting to no more than a form of instrumentalism and the factivity
of the terms 'knowledge' and 'information' are the first two preliminary considerations
that need to be borne in mind when assessing information based approaches to quantum 
mechanics. The third and final one is as follows. It was emphasised in Part I that the 
everyday notion of information — with its links to knowledge, language and meaning — is 
to be firmly distinguished from the technical notion of information that arises in infor- 
mation theory. The latter is not a semantic or an epistemic concept; and pace Dretske, 
considerations of mechanical communication systems would seem to have precious little 
to do with explaining semantic and epistemic properties. Now, keeping the distinction 
between the everyday and the technical notions of information clearly in mind is cru- 
cial when considering the role that quantum information theory might have to play in 
the foundations of quantum mechanics, for otherwise one may easily fall prey to some 
serious misconceptions. 

For example, it might well be thought that it is simply obvious that quantum in- 
formation theory will shed light on the interpretive problems of quantum mechanics. 
After all, the key conceptual problem in quantum mechanics is the problem of mea- 
surement; but what is measurement other than a transfer of information, an attempt to 
gain knowledge? As we are now equipped with a theory of information in the quantum 
domain, enlightenment is sure to follow! 

^Interestingly, Mermin (2001), developing an idea due to Peierls (1991), has sought to respond
to the challenge presented by the 'Whose information?' question, by deriving conditions under which
different density matrices can be thought to represent different knowledge that various agents might
have about one and the same system (see also Mermin (2002); Brun et al. (2002)). This approach has
rightly been criticised by Fuchs, however (see Fuchs (2002b), esp. pp.19-25; 42-51; and also Caves et al.
(2002)), on the grounds that any approach in this vein, that involves assessing whether an agent's
ascription of a state to a system is correct, or admissible, or what-not, amounts to giving up on the
original desire for non-objectivity of the state that was supposed to be doing the distinctive conceptual
work. If there is a question, ultimately, of being right or wrong, then one might as well openly admit
that the quantum state is objective after all. In essence, the point here may be put in terms of factivity
again: if we imagine different knowledge that people might have about a system and the different states
they may assign on the basis of that knowledge, then there must exist determinate facts about the
system that each of them is, to a greater or lesser degree, aware of. Although he does not himself put
it in these terms, Fuchs' awareness of the factivity of the terms 'knowledge' and 'information' and his
related criticism of Mermin, mark the change from the objective Bayesianism of Fuchs (2001) to the
more consistent subjectively Bayesian position of Fuchs (2002a).
This line of thought rests, of course, on a flagrant confusion between information in
its everyday and its technical senses; between an epistemic and an information-theoretic 
sense of information. The following is a concrete example of just such a confusion: 

Quantum measurements are usually analyzed in abstract terms of wave- 
functions and Hamiltonians. Only very few discussions of the measurement 
problem in quantum theory make an explicit effort to consider the crucial 
issue — the transfer of information. Yet obtaining knowledge is the very reason
for making a measurement. (p.viii)

However, if any link is to be established between the techniques and applications of 
quantum information theory and the conceptual puzzles of quantum mechanics, it is not 
to be achieved by a facile equation of radically different senses of the term 'information'. 
Here, more than anywhere, we need to be vividly aware of Shannon's warnings about 
getting over-excited by a few heavily-loaded terms; and we need to be on the look-out 
to make sure no-one is being misled by an implicit or explicit slide between different 
senses of the term 'information'. 

With these preliminary reflections behind us, we shall turn, in the next chapter, to 
consider some specific proposals for the application of information-theoretic ideas to the 
foundational problems of quantum mechanics. 



Chapter 9 

Some Information-Theoretic Approaches



If one of the prima facie difficulties faced by attempts to appeal to notions of information 
in approaching foundational questions in quantum mechanics is that of avoiding an uned- 
ifying descent into instrumentalism, then where else may we hope to make progress with 
the project? One obvious avenue for attack is to investigate whether ideas from quan- 
tum information theory might help provide a perspicuous conceptual basis for quantum 
mechanics, perhaps by leading us towards an enlightening axiomatisation of the theory. 
Certainly, strikingly different possibilities for information transfer and computation are 
to be found in quantum mechanics when compared with the classical case, and might 
these facts not help us characterize how and why quantum theory has to differ from 
classical physics? 

The thought that ideas from quantum information might lead us towards a transparent
conceptual basis for quantum mechanics has been expressed perhaps most powerfully
by Fuchs and co-workers (cf. Fuchs, 2003). In this chapter, we shall investigate
two particular approaches in this vein: the Foundational Principle of Zeilinger, and the
information-theoretic characterization theorem of Clifton, Bub and Halvorson. 
9.1 Zeilinger's Foundational Principle

In Part I, Brukner and Zeilinger's attempted criticism of the Shannon information in
quantum mechanics was discussed, and as remarked there, what provides the background
to this criticism is Zeilinger's (1999) proposal of an information-theoretic foundational
principle for quantum mechanics. 

This foundational principle, Zeilinger suggests, is to play a role in quantum mechan- 
ics similar to that of the Principle of Relativity in Special Relativity, or to the Principle 
of Equivalence in General Relativity. Like these, the Foundational Principle is to be 
an intuitively understandable principle which plays a key role in deriving the struc- 
ture of the theory. In particular, he suggests that the Foundational Principle provides 
an explanation for the irreducible randomness in quantum measurement and for the 
phenomenon of entanglement. We will begin by considering whether the Principle can 
indeed be successful as a foundational principle for quantum mechanics, before suggest- 
ing why advocacy of the Principle might be thought to require or motivate arguments 
against the Shannon information. 

Before stating the Foundational Principle, it is helpful to identify two philosophical 
assumptions that Zeilinger's position incorporates. The first is a form of phenomenalism: 
physical objects are taken not to exist in and of themselves, but to be mere constructs
relating sense impressions (Zeilinger, 1999b, p.633)^; the second assumption is an explicit
instrumentalism about the quantum state:

The initial state... represents all our information as obtained by earlier observation...
[the time evolved] state is just a short-hand way of representing the
outcomes of all possible future observations. (Zeilinger, 1999b, p.634)

With these assumptions noted, let us consider the two distinct formulations of the Prin- 
ciple presented in Zeilinger (1999): 

FP 1 An elementary system represents the truth value of one proposition. 
FP 2 An elementary system carries one bit of information. 

^Here I take phenomenalism to be the doctrine that the subject matter of all conceivable propositions
is one's own actual or possible experiences, or the actual and possible experiences of another.




At first glance, these two statements appear most naturally to be concerned with 
the amount of information that can be encoded into a physical system. However, this 
interpretation is at odds with the passage in which Zeilinger motivates the Foundational 
Principle. In this passage, his concern is with the number of propositions required to 
describe a system. He considers the analysis of a composite system into constituent 
parts and remarks that it is natural to assume that each constituent system will require 
fewer propositions for its description than the composite does. The end point of the 
analysis will be reached when we have systems described by a single proposition only; 
and it is these systems that are termed 'elementary'. 

The apparent tension between these different ideas of how FP 1 and FP 2 should be read
is relieved when Zeilinger goes on to explain what he means by an elementary system 
carrying or representing some information: 

...that a system "represents" the truth value of a proposition or that it "car- 
ries" one bit of information only implies a statement concerning what can 
be said about possible measurement results. (Zeilinger, 1999b, p.635)

Thus the Foundational Principle is not a constraint on how much information can be 
encoded into a physical system. It is a constraint on how much the state of an elemen- 
tary system can say about the results of measurement. This interpretation is rendered 
consistent with the discussion in terms of the propositions required to describe a system, 
as from Zeilinger's instrumentalist point of view, describing (the state of) a quantum 
system can only be to make a claim about future possible measurement results. Further- 
more, we can understand the peculiar idiom of a system 'representing' some information, 
where this is taken not to refer to the encoding of some information into a system, when 
we recall that from the point of view of Zeilinger's phenomenalism, a physical system 
is not an actual thing. On his view, a system represents a quantity of information 
about measurement results because a physical system literally is nothing more than an 
agglomeration of actual and possible sense impressions arising from observations. 

In short, however, it seems that a clearer, and perhaps more philosophically neutral, 
statement of the Foundational Principle would be the following: 
FP 3 The state of an elementary system specifies the answer to a single yes/no experimental
question,

where we have used the fact that by 'proposition' Zeilinger means something that represents
an experimental question. With this relatively clear statement of the Foundational
Principle in hand, let us now consider its claims as a foundational principle for quantum 
mechanics. 

To begin with, we should note the limitations implied by Zeilinger's conception of the 
description of a system. It might not always be the case that the state of an individual 
system can be characterised appropriately as a list of experimental questions to which
answers are specified; and in such a case, the terms of the Foundational Principle cannot 
be set up. Consider the de Broglie-Bohm theory, for example, with its elements of holism 
and contextuality — even though the theory is deterministic, the results of measurements 
are in general not determined by the properties of the object system alone but are the 
result of interaction between object system and measuring device. It would seem that 
this theory could neither be supported nor ruled out by the Foundational Principle, as 
we can neither identify something that would count as an elementary system in this 
theory, given the way 'elementary system' has been defined, nor, a fortiori, begin to 
enumerate how many experimental questions such an entity might specify. However, for 
present purposes, let us put this sort of worry to one side. 

Another concern arises when considering the distinction we have drawn between 
describing a system and encoding information into it. Unlike encoding, the notion of 
describing a system presupposes a certain language in which the description is made, 
and the description of a given system could be longer or shorter depending on the 
conceptual resources of the language used. If we are to make a claim about the number 
of propositions required to describe a system, then, as we must when identifying an 
elementary system to figure in the Foundational Principle, we must already have made 
a choice of the set of concepts with which to describe the system. But this is worrying if 
the purpose of the Foundational Principle is to serve as a basis from which the structure 
of our theory is to be derived. If we already have to make substantial assumptions 
about the correct terms in which the objects of the theory are to be described, then it 
may be that the Foundational Principle will be debarred from serving its foundational
purpose. With this worry in mind, let us now consider the first of the concrete claims 
for the Foundational Principle, that it explains the irreducible randomness of quantum 
measurements. 

Zeilinger's suggestion is that we have randomness in quantum mechanics because: 

...an elementary system cannot carry enough information to provide definite
answers to all questions that could be asked experimentally (Zeilinger, 1999b,
p.636),

and this randomness must be irreducible, because if it were reduced to hidden properties, 
then the system would carry more than one bit of information. Unfortunately, this does 
not constitute an explanation of randomness, even if we have granted the existence of 
elementary systems and adopted the Foundational Principle. For the following question 
remains: why is it that experimental questions exist whose outcome is not already 
determined by a specification of the finest grained state description we can offer? How 
is it that any space for randomness remains? Or again, why isn't one bit enough? 

The point is, it has not been explained why the state of an elementary system cannot 
specify an answer to all experimental questions: this does not in fact appear to follow 
from the Foundational Principle. The Foundational Principle says nothing about the 
structure of the set of experimental questions, yet this turns out to be all-important. 

Consider the case of a classical Ising model spin, which has only two possible states, 
'up' or 'down'; here one bit, the specification of an answer to a single experimental 
question ('Is it up?') is enough to specify an answer to all questions that could be 
asked. There is no space for randomness here, yet this classical case is perfectly consistent 
with the Foundational Principle. Thus it seems that no explanation of randomness is 
forthcoming from the Foundational Principle and furthermore, it is far from clear that 
the Principle, on its own, in fact allows us to distinguish between quantum and classical. 

Of course, if one assumes that experimental questions are represented in the quantum 
way, as projectors on a complex Hilbert space, then even for the simplest non-trivial 
state space, there will be non-equivalent experimental questions, the answer to one of 
which will not provide an answer to another; but we cannot assume this structure if it 
is the very structure that we are trying to derive. It appears from the way in which the
Foundational Principle is supposed to be functioning in the attempted explanation of 
randomness, that something like the quantum structure of propositions is being assumed. 
But this is clearly fatal to the prospects of the Foundational Principle as a foundational 
principle.^ 
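
The contrast drawn in the last three paragraphs can be illustrated by a short sketch, offered here as a toy model under stated assumptions rather than as anything found in Zeilinger's paper. The first half treats the classical two-state spin, with yes/no questions modelled as maps from states to answers; the second half assumes the quantum representation of questions as projectors and uses the Born rule. All names are hypothetical.

```python
from itertools import product
from math import sqrt

# --- Classical two-state system (the Ising spin of the text) ---
STATES = ("up", "down")

# Every yes/no question is a map from states to answers; with two states
# there are four such maps, two of them trivial.
QUESTIONS = [dict(zip(STATES, answers))
             for answers in product((True, False), repeat=len(STATES))]

IS_IT_UP = {"up": True, "down": False}


def states_compatible_with(question, answer):
    """States in which `question` receives `answer`."""
    return [s for s in STATES if question[s] == answer]


def answers_fixed_by(question, answer):
    """True if knowing this one answer fixes the answer to every question."""
    candidates = states_compatible_with(question, answer)
    return all(len({q[s] for s in candidates}) == 1 for q in QUESTIONS)


classical_fixed = all(answers_fixed_by(IS_IT_UP, a) for a in (True, False))


# --- Qubit, ASSUMING the quantum representation of questions ---
def born_probability(question_vec, state_vec):
    """Probability of 'yes' for the projector onto question_vec."""
    amplitude = sum(q.conjugate() * s for q, s in zip(question_vec, state_vec))
    return abs(amplitude) ** 2


Z_UP = (1.0, 0.0)                     # 'Is the spin up along z?'
X_UP = (1 / sqrt(2), 1 / sqrt(2))     # 'Is the spin up along x?'

p_z_given_z = born_probability(Z_UP, Z_UP)   # the one specified answer
p_x_given_z = born_probability(X_UP, Z_UP)   # a question left open by it

print("classical: one answer fixes all answers?", classical_fixed)
print("quantum: P(z up | z up) =", p_z_given_z)
print("quantum: P(x up | z up) =", p_x_given_z)
```

Both halves respect the one-bit constraint: in each case the state specifies the answer to a single yes/no question. Randomness appears only in the second half, and it enters through the assumed structure of the set of questions, not through the constraint itself.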

Does the Principle fare any better with the proposed explanation of entanglement? 
The idea here is to consider N elementary systems, which, following from the Founda- 
tional Principle, will have N bits of information associated with them. The suggestion is 
that entanglement results when all bits are exhausted in specifying joint properties of
the system, leaving none for individual subsystems (Zeilinger, 1999b), or more generally,
when more information is used up in specifying joint properties than would be possible
classically. The underlying thought is that this approach captures the intuitive idea that
when we have an entangled system, we know more about the joint system (which may
be in a pure state) than we do about the individual sub-systems (which must be mixed
states). The proposal is further developed in Brukner et al. (2001), where Brukner and
Zeilinger's information measure is used to provide a quantitative condition for N qubits
to be unentangled, which is then related to a condition for the violation of a certain
N-party Bell inequality.

To give a basic example of how the idea is supposed to work, consider the case of 
two qubits. Recall that the maximally entangled Bell states are joint eigenstates of 
the observables $\sigma_x \otimes \sigma_x$ and $\sigma_y \otimes \sigma_y$. From the Foundational Principle, only two bits
of information are associated with our two systems, i.e., the states of these systems
can specify the answer to two experimental questions only. If the two questions whose
answers are specified are 'Are both spins in the same direction along x?'
($\frac{1}{2}(\mathbf{1} \otimes \mathbf{1} + \sigma_x \otimes \sigma_x)$) and 'Are both spins in the same direction along y?'
($\frac{1}{2}(\mathbf{1} \otimes \mathbf{1} + \sigma_y \otimes \sigma_y)$),
then we end up with a maximally entangled state. If, by contrast, the two questions had
been 'Are both spins in the same direction along x?' and 'Is the spin of particle 1 up

^In a sense, we could say that Zeilinger's explanation of randomness is problematic because it fails
to explain why the state space of quantum mechanics is so gratuitously large from the point of view
of storing information (Caves and Fuchs, 1996). It is then striking that this attempted information-theoretic
foundational approach to quantum mechanics has not allowed for one of the significant insights
vouchsafed by quantum information theory.
along x?', the information would not have all been used up specifying joint properties
and we would have instead a product state (joint eigenstate of $\sigma_x \otimes \sigma_x$ and $\sigma_x \otimes \mathbf{1}$).
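
The two-qubit example can be checked directly. The following is a minimal sketch using only the Python standard library; the helper names are hypothetical and the basis ordering |00>, |01>, |10>, |11> (in the x-independent computational basis) is a convention adopted here. Note that it takes the quantum representation of the questions for granted, so it illustrates the criterion within quantum mechanics and nothing more.

```python
from math import sqrt

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]


def kron(a, b):
    """Kronecker (tensor) product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def apply(m, v):
    """Matrix-vector product for a 4x4 matrix and a length-4 vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def eigenvalue_or_none(m, v):
    """Return +1 or -1 if v is an eigenvector of m with that value, else None.

    Eigenvalue +1 means the question (1 + m)/2 is answered 'yes',
    eigenvalue -1 means it is answered 'no'.
    """
    mv = apply(m, v)
    for lam in (1, -1):
        if all(abs(mv[r] - lam * v[r]) < 1e-12 for r in range(4)):
            return lam
    return None


XX = kron(SX, SX)   # 'same direction along x?'
YY = kron(SY, SY)   # 'same direction along y?'
X1 = kron(SX, I2)   # 'is the spin of particle 1 up along x?'

s = 1 / sqrt(2)
SINGLET = [0, s, -s, 0]             # (|01> - |10>) / sqrt(2), a Bell state
PHI_PLUS = [s, 0, 0, s]             # (|00> + |11>) / sqrt(2), a Bell state
PLUS_PLUS = [0.5, 0.5, 0.5, 0.5]    # |+>|+>, a product state

for name, state in [("singlet", SINGLET), ("phi+", PHI_PLUS), ("|+>|+>", PLUS_PLUS)]:
    print(name,
          "XX:", eigenvalue_or_none(XX, state),
          "YY:", eigenvalue_or_none(YY, state),
          "X1:", eigenvalue_or_none(X1, state))
```

The Bell states return definite answers to both joint questions and no definite answer to the individual one, while the product state answers the joint x question and the individual x question but leaves the joint y question open.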

Now, although this idea may have its attractions when used as a criterion for entanglement
within quantum mechanics, it does not succeed in providing an explanation for
the phenomenon of entanglement, which was the original claim. 

If we return to the starting point and consider our elementary systems, all that the 
Foundational Principle tells us regarding these systems is that their individual states 
specify the answer to a single yes/no question concerning each system individually. 
There is, as yet, no suggestion of how this relates to joint properties of the combined 
system. Some assumption needs to be made before we can go further. For instance, we 
need to enquire whether there are supposed to be experimental questions regarding the 
joint system which can be posed and answered that are not equivalent to questions and 
answers for the systems taken individually. (We know that this will be the case, given
the structure of quantum mechanics, but we are not allowed to assume this structure,
if we are engaged in a foundational project.^) If this is the case then there can be
a difference in the information associated with correlations (i.e., regarding answers to 
questions about joint properties) and the information regarding individual properties. 
But then we need to ask: why is it that there exist sets of experimental questions to 
which the assignment of truth values is not equivalent to an assignment of truth values 
to experimental questions regarding individual systems? 

Because such sets of questions exist, more information can be 'in the correlations' 
than in individual properties. Stating that there is more information in correlations 
than in individual properties is then to report that such sets of non-equivalent questions 
exist, but it does not explain why they do so. However, it is surely this that demands 
explanation — why is it not simply the case that all truth value assignments to exper- 
imental questions are reducible to truth value assignments to experimental questions 
regarding individual properties, as they are in the classical case? That is, why does 
entanglement exist? In the absence of an answer to the question when posed in this 
manner, the suggested explanation following from the Foundational Principle seems 

^To illustrate, a simultaneous truth value assignment for the experiments $\sigma_x \otimes \sigma_x$ and $\sigma_y \otimes \sigma_y$ cannot
be reduced to one for experiments of the form $\mathbf{1} \otimes \mathbf{a}\cdot\boldsymbol{\sigma}$, $\mathbf{b}\cdot\boldsymbol{\sigma} \otimes \mathbf{1}$.

dangerously close to the vacuous claim that entanglement results when the quantum
state of the joint system is not a separable state. 
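
Within quantum mechanics, the non-equivalence of joint and individual questions for two qubits can be seen in a few lines. The following is an illustrative calculation supplied here, not one given in the works under discussion; the unit Bloch vectors $\mathbf{a}$ and $\mathbf{b}$ are notation introduced for the purpose. A state that gives definite answers to individual spin questions on each qubit is a product of pure states, and for such a state the joint expectation values factorise.

```latex
% Product state of two qubits with unit Bloch vectors a and b
% (notation assumed for this illustration).
\[
  \langle \sigma_x \otimes \sigma_x \rangle = a_x b_x ,
  \qquad
  \langle \sigma_y \otimes \sigma_y \rangle = a_y b_y .
\]
% Definite answers to both joint questions would require both
% expectation values to equal +1 or -1. Since each component has
% modulus at most 1, this forces
\[
  |a_x b_x| = 1 \ \text{and} \ |a_y b_y| = 1
  \;\Longrightarrow\;
  |a_x| = |a_y| = 1
  \;\Longrightarrow\;
  a_x^2 + a_y^2 = 2 ,
\]
% which contradicts the normalisation
\[
  a_x^2 + a_y^2 + a_z^2 = 1 .
\]
```

So no assignment of definite answers to individual questions yields definite answers to both joint questions. This records that the non-equivalent sets of questions exist once the quantum structure is assumed; it does not explain why the world should have that structure.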

Of course, if we are in the business of looking within quantum mechanics and asking 
how product and entangled states differ, then it is indeed legitimate to consider something
like the condition Brukner et al. (2001) propose; and we can then consider how
good this condition is as a criterion for entanglement.^ But as mentioned before, if we
are trying to explain the existence of entanglement then we cannot simply assume the 
quantum mechanical structure of experimental questions. 

Let us close by considering a final striking passage. Zeilinger suggests that the Foundational
Principle might provide an answer to Wheeler's question 'Why the quantum?'
(Wheeler, 1990) in a way congenial to the Bohrian intuition that the structure of quantum
theory is a consequence of limitations on what can be said about the world:

The most fundamental viewpoint here is that the quantum is a consequence 
of what can be said about the world. Since what can be said has to be 
expressed in propositions and since the most elementary statement is a single 
proposition, quantization follows if the most elementary system represents
just a single proposition. (Zeilinger, 1999b, p.642)

But this passage contains a crucial non-sequitur. Quantization only follows if the propo- 
sitions are projection operators on a complex Hilbert space. And why is it that the world 
has to be described that way? That is the question that would need to be answered in 
answering Wheeler's question; and it is a question which, I have suggested, the Founda- 
tional Principle goes no way towards answering. 



9.1.1 Word and world: Semantic ascent 

At this juncture let us pause to consider the following parenthetical, but perhaps illuminating,
remarks.

The sentiment expressed in the last quotation of Zeilinger is evidently very close to 
that captured by the famous (or infamous) statement attributed to Bohr by Petersen: 

^At this point it is worth noting that there have been other discussions of entanglement which
develop the intuitive idea that when faced with entangled states, we know more about joint properties
than individual properties. As we saw in Section 5.1.2, a very general framework is presented by
Nielsen and Kempe (2001), who use the majorization relation to compare the spectra of the global and
reduced states of the system; a necessary (but not sufficient) condition for a state to be separable is
then that it be more disordered globally than locally.

There is no quantum world. There is only an abstract quantum physical
description. It is wrong to think that the task of physics is to find out how 
nature is. Physics concerns what we can say about nature. (Petersen, 1963, p.12)

The last sentence is particularly pertinent: 'Physics concerns what we can say about 
nature.' Compare again, another statement of Zeilinger's, '...what can be said about
Nature has a constitutive contribution on what can be "real".' (Reported in Fuchs
(2003, p.615)).

These views clearly pick out one strand of thought that can be seen to contribute to 
the wider speculative thesis that information may, in some sense, provide a new way in 
physics. If quantum mechanics reveals that the true subject matter of physics is what 
can be said, rather than how things are, it seems but a small step from there to the view 
that what is fundamental is the play of information. 

However, there is a very obvious difficulty with the thought that what can be said 
provides a constitutive contribution to what can be real and that physics correspondingly
concerns what we can say about nature. Simply reflect that some explanation needs to 
be given of where the relevant constraints on what can be said come from. Surely there 
could be no other source for these constraints than the way the world actually is — it can't 
merely be a matter of language.^ It is because of the unbending nature of the world
that we find the need to move, for example, from classical to quantum physics; that 
we find the need to revise our theories in the face of recalcitrant experience. Zeilinger 
and Bohr (in the quotation above) would thus seem to be putting the cart before the 
horse, to at least some degree. Schematically, it's the way the world is (independently 
of our attempted description or systematisation of it) that determines what can usefully 
be said about it, and that ultimately determines what sets of concepts will prove most 
appropriate in our scientific theorising. 

^Of course, what statements can be made depends on what concepts we possess; and, trivially, in 
order to succeed in making a statement, one needs to obey the appropriate linguistic rules. But the 
point at issue is what can make one set of concepts more fit for our scientific theorising than another? 
For example, why do we have to replace commuting classical physical quantities with non-commuting 
quantum observables? As Quine perspicuously notes '...truth in general depends on both language 
and extra-linguistic fact. The statement "Brutus killed Caesar" would be false if the world had been 
different in certain ways, but it would also be false if the word "killed" happened to have the sense of
"begat".' (Quine, 1953, p.36). The world is required to provide the extra-linguistic component that will
make one set of concepts more useful than another; furthermore, without an extra-linguistic component
to truth, we could only ever have analytic truths — and that would no longer be physics. 
Another point can be drawn from the Petersen quotation. With its focus on the level
of physical description and what can be said about nature (as opposed to how nature 
is) this passage can be seen to provide us with an example of what is often known as 
semantic ascent. 

Semantic ascent is the move from what Carnap called the material mode to the formal 
mode, that is, roughly speaking, from talking about things to talking about words. As 
Quine says, 'semantic ascent... is the shift from talking in certain terms to talking about
them' (Quine, 1960, p.271). Bohr, it would seem, would have us ascend from the level
of using words within our theory, to the level of describing our descriptions. This, the 
suggestion is, is the true task of physics. 

What would such an ascent achieve? As Quine is quick to note, semantic ascent 
doesn't bake much ontological bread: 

Semantic ascent... applies anywhere. 'There are wombats in Tasmania' might
be paraphrased as '"Wombat" is true of some creatures in Tasmania', if there
were any point in it. (Quine, 1960, p.272)

The point is this. It's true, but trivial, that if we ascend to a level at which we are 
describing what we say about nature, that is, take the physical description as our focus 
of interest, then our subject matter will no longer be the world, for we have moved from 
talking in various terms to talking about them. At this level there will, in a sense, be 
no quantum world, for we are talking about words and not the world. 

But the fact that we have ascended doesn't mean that the level we have ascended 
from goes away. The world doesn't disappear because we may be talking about the terms 
in which we describe it. It follows that one can't shirk the difficulties and mysteries of
interpreting quantum mechanics by simply saying: 'Physics concerns what we can say 
about nature,' for, crucially, we can always ask — well, what is said? (descent after our 
semantic ascent), as well as — how do we say it? (remaining at the ascended level). 

The fact that one can always make a semantic ascent does not mean that one can
do without the level from which ascent has been made.^ Indeed, the interesting
interpretational questions concern why one should take one stance rather than another to
claims made using terms within a theory, and the usual ranges of options (various forms
of realism, instrumentalism and hybrids thereof) will remain open irrespective of ascent.
It is important to realise that the semantic ascent of the Bohrian quote doesn't succeed
in highlighting any differences between the classical world view and quantum mechanics.
In so far as 'there is no quantum world' is true in the Petersen quotation, it would be
true of the classical world too: it is a universal and entirely innocuous observation that
if we ascend to the level at which we are describing our physical-theory discourse, then
our subject matter will be words rather than world.

^It might be felt, perhaps, that this is the real import of the Bohr quote, and serves to distinguish
the quantum from the classical case: in the quantum case, we might be supposed to imagine that
one can intelligibly kick away the lower level, having made the semantic ascent. Such a suggestion

The 'There is no quantum world' passage is apt to induce apoplexy in the realist- 
minded, but there seems after all no call for raised blood-pressures. When analysed as 
an example of semantic ascent, it seems that the passage is — so far as it is intelligible — 
somewhat innocuous in import. 

9.1.2 Shannon information and the Foundational Principle 

To finish the story, let us consider how Zeilinger's approach based on the Foundational 
Principle interplays with the discussion of measures of information in quantum mechan- 
ics. 

As we have noted, Zeilinger adopts an instrumentalist view of the quantum state, 
and such instrumentalist sentiments are common. Where Brukner and Zeilinger depart 
from the norm, however, is in adopting a very literal construal of the information taken 
to constitute the state, by adopting, at least in a simple case, the Hilbert-Schmidt 
representation of states: 

We describe a photon by a catalog of information ("information vector") 
i — (ii,i2) about mutually complementary propositions {'Pi,V2}- Such 
propositions are, f or example, V^ : "the p olarization of the photon is ver- 



The component ii is defined as (p—q), where p and q are the probabilities for vertical and 

('vertiginous semantic ascent', as it might be called) is incoherent however. It would amount to the 
claim that the 'descent' question 'So: what was said?' becomes unintelligible, but this would entail that 
the terms under discussion have to become entirely devoid of meaning, and as such they would have no 
role whatsoever in physics. 
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horizontal polarization respectively^. Thus, the components of the information vector i 
correspond, effectively, to the coefRcients qf in eqn. (j2.6|l . and the propositions V to the 
operators P- . 

On this conception, an amount of information in the form of probabilities has been 
associated to propositions representing the outcomes of mutually unbiased measure- 
ments; the information and the experimental propositions it is about can be read off 
directly from the Hilbert-Schmidt representation of the state, given some choice of basis 
operators (choice of complete set of mutually unbiased measurements). Illustrating the
general idea, if probability 1 is associated to some proposition, then the state says the 
maximum possible about the outcome of the measurement with which that proposition 
is associated; if there is a flat distribution for outcomes of a measurement, the state con- 
tains no information about it. In general the state will contain partial information about 
a number of mutually unbiased observables. Endorsing the instrumentalist line, all that 
the state is is an amount of information in this way about mutually complementary 
observables. 

Now the statements FP1 and FP2 refer to an elementary system carrying or representing
an amount of information. As we have seen, rather than being a putative
restriction on how much information might be encoded into, or read from, a physical 
system, the Foundational Principle is intended by Zeilinger to capture a restriction on 
how much can be said about measurement outcomes, and hence, in particular, is a 
restriction on how much the state can say about measurement outcomes. 

For Zeilinger, the state will in general be constituted by amounts of partial informa- 
tion about measurement outcomes. The Foundational Principle requires that the state 
can only contain a limited amount of information, namely one bit; hence it follows that 
the amounts of partial information contained in the state, although how these are to be 
quantified has not yet been specified in detail, must add up to one bit's worth in total. 

This, however, rules out the Shannon information as the measure of the amount
'carried' by the state about a given measurement; we know that in general we will not
have a sum to unity for amounts of partial information conceived in the way outlined
(Section 2.3). (As H(p) does not sum to a unitarily invariant quantity for a complete
set of mutually unbiased measurements, we cannot guarantee that we will attain the
value of one for any given pure state.)

⁷For this two-dimensional quantum system, we have here, essentially, the Bloch sphere representation.

Thus the conjunction of the Foundational Principle with Brukner and Zeilinger's 
brand of literal instrumentalism about the quantum state is inconsistent with adopting
the Shannon information to measure the amount of information 'carried' about a
measurement in Zeilinger's sense. I suggest that it is this fact that tempts Brukner and 
Zeilinger to argue, unsuccessfully as it turns out, that the Shannon information is not 
the correct measure of information and cannot be applied in quantum mechanics. 
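
The failure of the Shannon measure to sum to one bit can be checked numerically for a single qubit. The following is a minimal sketch rather than anything drawn from Brukner and Zeilinger's own presentation: it assumes the three spin measurements along x, y and z as the complete set of mutually unbiased measurements, takes 1 - H(p) as the Shannon information 'carried' about each measurement, and takes (p - q)^2 as the Brukner-Zeilinger-style measure for a two-outcome measurement. The function names are illustrative only.

```python
import math

def shannon_info(p):
    """1 - H(p) in bits for a two-outcome distribution (p, 1 - p)."""
    h = -sum(x * math.log2(x) for x in (p, 1.0 - p) if x > 0.0)
    return 1.0 - h

def bz_info(p):
    """Squared-difference measure (p - q)^2 with q = 1 - p (assumed normalisation)."""
    return (2.0 * p - 1.0) ** 2

def totals(bloch):
    """Sum both measures over spin measurements along x, y and z.

    `bloch` is the Bloch vector of the state; the probability of the
    'up' outcome along axis k is (1 + r_k) / 2.
    """
    probs = [(1.0 + r) / 2.0 for r in bloch]
    return sum(shannon_info(p) for p in probs), sum(bz_info(p) for p in probs)

# Two pure states: one aligned with z, one tilted equally towards all three axes.
aligned = (0.0, 0.0, 1.0)
s = 1.0 / math.sqrt(3.0)
tilted = (s, s, s)

for name, state in (("aligned", aligned), ("tilted", tilted)):
    shannon_total, bz_total = totals(state)
    print(name, round(shannon_total, 4), round(bz_total, 4))
```

For the state aligned with z both totals come to one bit. For the tilted state the squared-difference total is still one, while the Shannon total falls to roughly 0.77: this is the sense in which the Shannon measure fails to give a unitarily invariant sum.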

We may close this discussion with two final comments. First, consider what someone 
rather more realist about the quantum state might make of the Foundational Princi- 
ple. Here the information idiom would no longer be particularly enticing and a more 
precise statement of what is being expressed by the Foundational Principle in quantum 
mechanics would be natural: 

('R'FP) Any projective measurement other than in the eigenbasis of $\rho$ results
in a shorter vector in $V_h(\mathbb{C}^n)$.

('R'FP for 'realist' Foundational Principle.) That is, any such measurement would 
result in a more spread probability distribution; if we began with a pure state then 
post- (non-selective) measurement, the ensemble will no longer be represented by a one- 
dimensional projector. Given this statement of the Principle, we see that it is a matter 
of choice whether or not, or with which quantities, we choose to discuss the uncertainties
associated with the probability distributions generated by the state. (Recall that in the 
Hilbert-Schmidt representation of a density operator, the probability for a measurement 
outcome is simply the projection of the vector representing the state onto the vector 
representing the measurement outcome.) 

Second, we might wonder whether the foregoing argument indicates that for the 
instrumentalist at least, I(p) does after all represent the 'correct' measure of information
in quantum mechanics. Such a choice would appear very artificial given the close relation
between the functioning of I(p) and H(p) discussed in Section 2.3.2. Note, however, that
one could still be an instrumentalist about the quantum state while adopting 'R'FP as
more genuinely informative than FP1 and FP2. The instrumentalist is not, then, forced
to accept I(p) as the only correct measure of information in quantum mechanics. This
is reassuring, for as pointed out earlier, the choice of measure of information is a matter 
of convention and convenience, a mere matter of whether the quantity in question suits 
the task to hand. 

We saw in Chapter 2 that Brukner and Zeilinger's arguments against the applicability
of the Shannon information in quantum mechanics are unsuccessful; and we have now 
seen how these arguments would seem to be motivated by the conjunction of Zeilinger's 
Foundational Principle with a particular form of instrumentalism about the quantum 
state. Even if one has instrumentalist leanings, however, this does not imply that the 
Brukner-Zeilinger measure can be the only correct measure of information in quantum 
mechanics. 



9.2 The Clifton-Bub-Halvorson characterization theorem

I have argued that Zeilinger's Foundational Principle does not constitute a principle 
from which we may derive the structure of quantum mechanics, nor which allows us 
to understand the origins of entanglement and quantum randomness. In essence, it 
is silent about the structure of the set of experimental questions, yet it is this that 
turns out to be crucial. The next approach we shall consider, that of Clifton, Bub
and Halvorson (Clifton et al., 2003), provides a far happier conclusion. Their project of
characterizing quantum mechanics in terms of three information-theoretic constraints is
indeed successful, although it may be questioned whether all three are strictly necessary.
I shall outline the approach here, before moving on to raise some questions concerning
the initial assumption of a C*-algebraic starting point; and then consider in what sense
their axiomatic approach may be said to provide an information-theoretic interpretation
of quantum mechanics, or to motivate such an interpretation.

9.2.1 The setting



Proceeding within a C*-algebraic framework, Clifton, Bub and Halvorson (Clifton et al.,
2003; Halvorson, 2003) succeed in characterizing quantum theory in terms of three
information-theoretic constraints. We shall call this the CBH characterization theorem.
The constraints are these: 

1. No superluminal information transmission between two systems by measurement 
on one of them; 

2. no broadcasting; 

3. no unconditionally secure bit-commitment. 

Let us briefly review these various terms. 

First, the setting is to assume a C*-algebraic characterization of physical theories (for
a friendly introduction to this formalism, see for example Gudder (1977)). A C*-algebra
is an involutive Banach algebra $\mathcal{B}$ over the complex numbers satisfying $\|A^*A\| = \|A\|^2$
for every $A \in \mathcal{B}$.

Some definitions: A complex algebra is a complex vector space with an identity and
an associative, distributive product, $AB$. An involution on a complex algebra $\mathcal{B}$ is a
map $* : \mathcal{B} \to \mathcal{B}$, satisfying:

$$(A^*)^* = A, \quad (A + B)^* = A^* + B^*, \quad (\lambda A)^* = \lambda^* A^*, \quad (AB)^* = B^* A^*, \quad \forall A, B \in \mathcal{B}.$$

A Banach algebra is an algebra equipped with a norm such that $\|AB\| \leq \|A\|\|B\|$,
complete in the norm topology.

An element of $\mathcal{B}$ is self-adjoint if $A^* = A$. A familiar example of a C*-algebra
is given by the set $\mathcal{B}(\mathcal{H})$ of bounded linear operators on a Hilbert space $\mathcal{H}$, where the
involution operation $*$ is the familiar adjoint $\dagger$. The self-adjoint elements of a C*-algebra
are usually interpreted as observables.

In a C*-algebra, a state, $\omega$, is a linear functional on the C*-algebra that is i) positive,
$\omega(AA^*) \geq 0$, and ii) normalized, $\omega(\mathbf{1}) = 1$. The state is to be understood as ascribing
expectation values to the elements of the algebra corresponding to observable quantities.
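
These definitions can be made concrete in the simplest non-commutative case, the algebra of 2×2 complex matrices acting on a two-dimensional Hilbert space. The sketch below is illustrative only: the matrices `A` and `rho` are arbitrary choices, the helper names are hypothetical, and the state is assumed to be given by a density matrix via ω(A) = Tr(ρA). It checks the C*-identity for the operator norm, and the positivity and normalization of ω.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(a):
    """Conjugate transpose, playing the role of the involution *."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def op_norm(a):
    """Operator norm: square root of the largest eigenvalue of a* a."""
    m = matmul(adjoint(a), a)
    t = trace(m).real
    d = (m[0][0] * m[1][1] - m[0][1] * m[1][0]).real
    disc = max(t * t - 4.0 * d, 0.0)   # guard against tiny negative rounding
    return math.sqrt((t + math.sqrt(disc)) / 2.0)

def omega(rho, a):
    """Expectation value ascribed to `a` by the state with density matrix `rho`."""
    return trace(matmul(rho, a))

# Arbitrary element of the algebra and an arbitrary (assumed) density matrix.
A = [[1 + 2j, 0.5j], [3.0 + 0j, -1 + 1j]]
rho = [[0.75 + 0j, 0.25j], [-0.25j, 0.25 + 0j]]
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]

print(op_norm(matmul(adjoint(A), A)), op_norm(A) ** 2)   # C*-identity
print(omega(rho, identity))                              # normalization
print(omega(rho, matmul(A, adjoint(A))))                 # positivity
```

The closed-form eigenvalue expression is specific to the 2×2 case; for larger matrices a numerical linear algebra routine would be the natural choice.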

In this framework, the schematic picture of a physical theory involves 'black box'
preparation and measuring devices. A fixed preparation procedure in the lab will give
rise to certain observed average values for measurements using a range of devices; systems
prepared in this way will correspondingly be assigned a particular state, $\omega$. The
measuring devices themselves are associated with elements of the algebra corresponding
to observable quantities: we can imagine black boxes in the lab with the letters 'A', 'B',
'C' and so on, inscribed on their surfaces, where $A, B, \ldots$ are self-adjoint elements of a
C*-algebra.



Finally, Clifton et al. (2003) assume the most general form of dynamical evolution,
viz., non-trace increasing completely positive maps (cf. Section 5.1.1).

By 'a quantum theory', Clifton, Bub and Halvorson mean a theory formulated in 
C*-algebraic terms for which the algebras of observables pertaining to distinct sys- 
tems commute, for which the algebra of observables on an individual system is non- 
commutative, and which allows space-like separated systems to be in entangled states. 
Roughly speaking, these characteristics are associated respectively with the first, second 
and third information-theoretic constraints. Now, while there is clearly much, much 
more to quantum theory than these rather abstract algebraic features, it is nonetheless 
plausible to argue that together they do capture the distinctive structural features of 
the theory. 

It is of course an important pre-supposition of the general argument that the C*-algebraic
approach be a sufficiently general one, and Clifton, Bub and Halvorson argue
accordingly, e.g.:

...it might seem that C*-algebras offer no more than an abstract way of talk- 
ing about quantum mechanics. In fact, the C*-algebraic formalism provides 
a mathematically abstract characterization of a broad class of physical theories
that includes all classical mechanical particle and field theories, as well
as quantum mechanical theories. (Bub, 2004, p. 245)

Thus, as well as reflecting that the set of bounded operators on a Hilbert space is
a C*-algebra, and that via the Gelfand, Naimark and Segal (GNS) construction and
the Gelfand-Naimark theorem, we know that every abstract C*-algebra has a concrete
faithful (i.e., isomorphic) representation as a *-subalgebra of the bounded operators
on some appropriate Hilbert space $\mathcal{H}$ (cf. Clifton et al., 2003), it is pertinent to point
out that classical phase space theories may be formulated in C*-algebraic terms, and
moreover to note that it may be shown that every commutative C*-algebra may be
given a phase space representation (cf. Clifton et al., 2003; Bub, 2004). However, as we
shall shortly see, some questions can nevertheless be raised about whether the starting
assumption of a C*-algebraic framework may perhaps be overly strong.

Turning now to the constraints featuring in the characterization theorem. The very
first, and a non-information theoretic one, is a constraint not yet mentioned, intended to
capture the idea that if we have two sub-algebras $\mathcal{A}$ and $\mathcal{B}$ of a C*-algebra $\mathcal{C}$, whose
self-adjoint elements are to represent, respectively, the observables of two distinct systems
A and B, then we need to ensure that A and B are distinct objects. Clifton et al.
(2003) adopt the notion of C*-independence to this end, the criterion being that the
preparation of any state of $\mathcal{A}$ has to be compatible with preparation of any state of $\mathcal{B}$.
That is, for any state $\rho_1$ of $\mathcal{A}$ and for any state $\rho_2$ of $\mathcal{B}$, there is some joint state $\rho$ of
the joint algebra $\mathcal{A} \vee \mathcal{B}$ such that $\rho|_{\mathcal{A}} = \rho_1$ and $\rho|_{\mathcal{B}} = \rho_2$. (The significance of requiring
a notion of independence of this sort is elaborated in Halvorson and Bub (2003).)

The first of the information-theoretic constraints, no superluminal signalling via measurement,
is fairly self-explanatory, corresponding to the no-signalling via entanglement
feature in ordinary quantum mechanics. The requirement is that the state of system
B, say, should be unaffected by any (non-selective) operation performed on the other
system. Clifton et al. (2003) show that this will hold iff the algebras $\mathcal{A}$ and $\mathcal{B}$ commute
(kinematic independence).
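
In ordinary quantum mechanics the content of this constraint can be illustrated for a pair of qubits. The sketch below is a toy check under stated assumptions, not the C*-algebraic proof: it takes a hypothetical entangled state cos α|00⟩ + sin α|11⟩, lets a non-selective projective measurement be performed on system A in a real basis rotated by an angle θ, and compares the reduced state of B before and after. All function names are illustrative.

```python
import math

def reduced_state_of_b(psi):
    """Partial trace over system A of |psi><psi|; psi is indexed by 2*a + b."""
    return [[sum(psi[2 * a + b] * psi[2 * a + c].conjugate() for a in range(2))
             for c in range(2)] for b in range(2)]

def b_state_after_measuring_a(psi, theta):
    """State of B after a non-selective measurement on A in a basis rotated by theta."""
    basis = ([math.cos(theta), math.sin(theta)],
             [-math.sin(theta), math.cos(theta)])
    rho = [[0j, 0j], [0j, 0j]]
    for e in basis:
        # Unnormalised conditional state of B for this outcome;
        # its squared norm is the probability of the outcome.
        bob = [sum(e[a] * psi[2 * a + b] for a in range(2)) for b in range(2)]
        for b in range(2):
            for c in range(2):
                rho[b][c] += bob[b] * bob[c].conjugate()
    return rho

alpha = 0.3   # arbitrary choice fixing the degree of entanglement
psi = [complex(math.cos(alpha)), 0j, 0j, complex(math.sin(alpha))]

before = reduced_state_of_b(psi)
for theta in (0.0, 0.7, 1.9):
    after = b_state_after_measuring_a(psi, theta)
    gap = max(abs(before[b][c] - after[b][c]) for b in range(2) for c in range(2))
    print(theta, gap)   # the gap stays at rounding level for every theta
```

Whatever basis is chosen on A, averaging over the outcomes returns the same reduced state for B, so nothing about the choice of measurement can be read off from B alone.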

The property of no broadcasting, the second of the three constraints, is a generalisation
of the idea of no cloning appropriate to mixed states (Barnum et al., 1996a). The
requirement on a cloning device was that it take as an input a system in any arbitrary
state $|\alpha\rangle$ and return two systems, each in the state $|\alpha\rangle$. Now, one might consider instead
a process which takes as an input a system in a state $\rho$ and returns as an output a pair
of systems A and B with a joint state $\rho^{AB}$, which may not be equal to $\rho \otimes \rho$, but for
which the reduced states of A and B are equal to $\rho$, $\mathrm{Tr}_B\,\rho^{AB} = \mathrm{Tr}_A\,\rho^{AB} = \rho$. Such a
process is termed broadcasting. (Clearly, it represents a more general process only when
the input state is mixed; for pure states it reduces to cloning.) Barnum et al.
showed that in quantum mechanics, broadcasting is possible for a set of states $\rho_i$ iff
they are commuting.

Clifton et al. (2003) first generalise the notion of broadcasting to the setting of C*-algebraic
states, and then prove that if $\mathcal{A}$ and $\mathcal{B}$ are abelian, then there is an operation
on $\mathcal{A} \vee \mathcal{B}$ that broadcasts all states of $\mathcal{A}$, while, conversely, if for each pair $\{\rho_0, \rho_1\}$ of
states of $\mathcal{A}$, there is an operation on $\mathcal{A} \vee \mathcal{B}$ that may broadcast this pair, then $\mathcal{A}$ is
abelian.
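
The easy direction of this result, that commuting states can be broadcast, has a simple classical illustration. The sketch below assumes states that are all diagonal in one fixed basis, so that each is just a probability distribution, and uses a hypothetical copy map sending outcome i to the pair (i, i). Both marginals of the output reproduce the input, although the joint state is perfectly correlated rather than the product ρ ⊗ ρ, which is why broadcasting is a weaker requirement than cloning for mixed states.

```python
def broadcast_diagonal(probs):
    """Copy map i -> (i, i): joint distribution over pairs of outcomes."""
    n = len(probs)
    return [[probs[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def marginals(joint):
    """Marginal distributions for the first and second system."""
    n = len(joint)
    first = [sum(joint[i][j] for j in range(n)) for i in range(n)]
    second = [sum(joint[i][j] for i in range(n)) for j in range(n)]
    return first, second

def product(probs):
    """Uncorrelated joint distribution, the analogue of rho (x) rho."""
    return [[p * q for q in probs] for p in probs]

rho = [0.7, 0.3]                  # a mixed state diagonal in the fixed basis
joint = broadcast_diagonal(rho)

print(marginals(joint))           # both marginals reproduce rho
print(joint == product(rho))      # the joint state is not the product state
```

The same copy map works for every distribution over the fixed basis at once, which mirrors the claim that a single operation broadcasts all states of an abelian algebra.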

So, thus far it has been proved that for a C*-algebraic theory, if it satisfies no- 
signalling and no-broadcasting, it must have algebras of observables that are non- 
commuting for individual systems, while observables for distinct systems commute; and 
conversely. 

The third information-theoretic constraint — no bit-commitment — takes a little more 
explaining. A bit-commitment protocol is an information-theoretic protocol in which 
one party, Alice, provides another party, Bob, with an encoded bit value (0 or 1) in such
a way that Bob may not determine the value of the bit unless Alice provides him with
further information at a later stage (the 'revelation' stage), yet in which the information
that Alice initially gives to Bob is nonetheless sufficient for him to be sure that the bit
value he obtains following revelation is indeed the one that Alice committed to initially.
An illustrative analogy would be a case in which Alice chooses a bit value and writes 
it on a piece of paper. She then locks the piece of paper in a safe and delivers the safe 
to Bob, but keeps the key to the safe herself. Bob may not immediately determine the 
value of the bit as the paper is locked in the safe, but he does know that when Alice 
later gives him the key, the bit value he will learn after opening the safe and reading 
the paper is indeed the one that Alice wrote down earlier. An insecure bit-commitment 
protocol is one in which either party can cheat: Bob, by determining something about 
the encoded bit value prior to revelation, or Alice, by remaining free to reveal either bit 
value at will at the revelation stage. 

Bit-commitment is not unconditionally secure classically because the encrypted information
that Alice initially provides to Bob will always display some bias towards the
encoded bit value that will allow Bob to cheat. It was shown by Lo and Chau (1997)
and Mayers (1997) that bit-commitment is not secure in the quantum mechanical case
either, but importantly, for a very different reason.

In ordinary quantum mechanics we are familiar with the idea of the ambiguity of 
density operators: quite different preparation procedures may give rise to the same 
density operator, and one will not be able to determine which preparation procedure 
was used by performing measurements on the systems prepared. This seems to suggest 
a way in which quantum bit-commitment might be possible. If Alice were to associate 
her commitment with two different preparations of a given density matrix, then Bob 
would not be able to determine anything about the bit value thus encoded; if Alice later 
tells him the preparation procedure she used, then we might be able to arrange things 
so that Bob can check that she is true to her word in having previously committed to a
specific bit value. 

An example might go like this. Consider a spin-1/2 system: a 50/50 mixture of
spin-up and spin-down in the z-direction is indistinguishable from a 50/50 mixture of
spin-up and spin-down in the x-direction; both give rise to the maximally mixed density
operator $\frac{1}{2}\mathbf{1}$. Alice might associate the first type of preparation with a 0 commitment and
the second with a 1 commitment. Bob, when presented with a system thus prepared will
not be able to determine which procedure was used. Alice also needs to keep a record
of which preparation procedure she employed, though, to form part of the evidence
with which she will convince Bob of her probity at the revelation stage. Thus, for a
0 commitment, Alice could prepare a classically correlated state of the form:

0 commitment:

$$\rho_0^{12} = \tfrac{1}{2}\left( |{\uparrow_z}\rangle\langle{\uparrow_z}| \otimes |{\uparrow_z}\rangle\langle{\uparrow_z}| + |{\downarrow_z}\rangle\langle{\downarrow_z}| \otimes |{\downarrow_z}\rangle\langle{\downarrow_z}| \right),$$

whilst for a 1 commitment, she could prepare a state

1 commitment:

$$\rho_1^{12} = \tfrac{1}{2}\left( |{\uparrow_x}\rangle\langle{\uparrow_x}| \otimes |{\uparrow_x}\rangle\langle{\uparrow_x}| + |{\downarrow_x}\rangle\langle{\downarrow_x}| \otimes |{\downarrow_x}\rangle\langle{\downarrow_x}| \right).$$



System 2 is then sent to Bob. 
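
The indistinguishability of the two preparations on which this protocol relies is easily verified. The following sketch, with illustrative helper names, assumes the usual column-vector representation of the spin states in the z-basis and builds the density matrix Bob would assign to the system he receives under each commitment.

```python
import math

def projector(v):
    """|v><v| for a two-component state vector v."""
    return [[v[i] * v[j].conjugate() for j in range(2)] for i in range(2)]

def mix(weighted_states):
    """Density matrix of an ensemble given as (probability, vector) pairs."""
    rho = [[0j, 0j], [0j, 0j]]
    for p, v in weighted_states:
        proj = projector(v)
        for i in range(2):
            for j in range(2):
                rho[i][j] += p * proj[i][j]
    return rho

s = 1 / math.sqrt(2)
up_z, down_z = [1 + 0j, 0j], [0j, 1 + 0j]
up_x, down_x = [s + 0j, s + 0j], [s + 0j, -s + 0j]

rho_z = mix([(0.5, up_z), (0.5, down_z)])   # what Bob holds after a 0 commitment
rho_x = mix([(0.5, up_x), (0.5, down_x)])   # what Bob holds after a 1 commitment

print(rho_z)
print(rho_x)
```

Up to rounding, both ensembles give half the identity, so no measurement on Bob's system alone can reveal which commitment was made.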

At the revelation stage, Alice declares which bit value she committed to, and hence
which preparation procedure she used. The protocol then proceeds in the following
way: If she committed to 0, Alice and Bob both perform $\sigma_z$ measurements and Alice
declares the result she obtains, which should be perfectly correlated with Bob's result,
if she really did prepare state $\rho_0^{12}$. Similarly, if she committed to 1, Alice and Bob both
perform $\sigma_x$ measurements and Alice declares her result, which again should be perfectly
correlated with Bob's result, if in truth she did prepare state $\rho_1^{12}$. If the results reported
by Alice and obtained by Bob don't correlate then Bob knows that Alice is trying to
mislead him.

The trouble with this otherwise attractive protocol is that Alice is able to cheat
freely by making use of what is known as an EPR cheating strategy. Thus, rather
than preparing one of the states $\rho_0^{12}$ or $\rho_1^{12}$ at the commitment stage, Alice can instead
prepare an entangled state, such as the Bell state $|\phi^+\rangle_{12}$. The reduced density operator
for Bob's system will still be $\frac{1}{2}\mathbf{1}$, but Alice can now simply wait until the revelation
stage to perform a suitable measurement on her half of the entangled pair and prepare
Bob's system at a distance in whichever of the two different mixtures she chooses.
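
The cheating strategy can be made explicit for this example. The sketch below is illustrative and makes two assumptions: that the Bell state in question is (|↑↑⟩ + |↓↓⟩)/√2 written in the z-basis, and that Alice holds system 1. It computes, for each outcome of a measurement by Alice in a chosen basis, the probability of that outcome and the state in which Bob's system is thereby prepared.

```python
import math

S = 1 / math.sqrt(2)
BELL = [S + 0j, 0j, 0j, S + 0j]     # (|00> + |11>)/sqrt(2), index = 2*a + b

Z_BASIS = ([1 + 0j, 0j], [0j, 1 + 0j])
X_BASIS = ([S + 0j, S + 0j], [S + 0j, -S + 0j])

def steer(psi, basis):
    """Alice measures system 1 in `basis`; return (probability, Bob's state) per outcome."""
    results = []
    for e in basis:
        # Contract Alice's index with <e| to get Bob's unnormalised conditional state.
        bob = [sum(e[a].conjugate() * psi[2 * a + b] for a in range(2))
               for b in range(2)]
        prob = sum(abs(c) ** 2 for c in bob)
        results.append((prob, [c / math.sqrt(prob) for c in bob]))
    return results

def overlap(u, v):
    """|<u|v>|^2 for two single-qubit state vectors."""
    return abs(sum(x.conjugate() * y for x, y in zip(u, v))) ** 2

for name, basis in (("z", Z_BASIS), ("x", X_BASIS)):
    for (prob, bob), e in zip(steer(BELL, basis), basis):
        # Overlap of Bob's prepared state with the state Alice found.
        print(name, round(prob, 6), round(overlap(bob, e), 6))
```

Measuring in the z-basis leaves Bob's system in one of the z-eigenstates with equal probability, reproducing the 0 preparation; measuring in the x-basis reproduces the 1 preparation. In each case Alice's outcome matches the state prepared for Bob, so she can pass his check for either bit value.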

It turns out that this sort of EPR cheating strategy will always be available for
any quantum bit-commitment protocol (Lo and Chau (1997); Mayers (1997); see Bub
(2001) for a detailed discussion): the possibility of preparing entangled states shared
between Alice and Bob rules out unconditionally secure bit-commitment in quantum
mechanics. The result in the general case relies upon the theorem of Hughston et al.
(1993), prefigured in Schrödinger (1936), which tells us that for a bipartite quantum
system, any mixture of states on one system may be prepared by performing a suitable
measurement (which may involve an ancilla) on the other system, when the pair are
in an appropriate entangled state (viz., one giving the correct reduced state for the
first system). Following Schrödinger (1935a, 1936) this phenomenon associated with
entanglement is often called remote steering.

The intuitive role for the no bit-commitment axiom in an attempted information- 
theoretic characterization of quantum mechanics is then as follows. In quantum me- 
chanics, the ambiguity of density operators seems to hold out the possibility of secure 
bit-commitment, but this possibility is vitiated by the fact that entanglement may exist 
between two widely separated parties. Now, we could consider a class of possible theories
which were locally like quantum mechanics in that they allowed ambiguous mixtures
to be prepared, yet in which entanglement between separated systems was ruled out,
perhaps decaying over distance; such a theory was in fact entertained by Schrödinger
(1936) as a way of resolving the EPR dilemma. Call such a theory a Schrödinger-type
theory. In a Schrödinger-type theory, secure bit-commitment would be possible as the
EPR cheating strategy, which relies on entangled states, cannot be employed. In order
to cheat, we would need entanglement.

But now suppose that in our attempted axiomatic characterisation we arrive at a 
class of theories which we know all to allow ambiguous mixtures. If we were then to add 
to our list of axioms the further requirement that bit-commitment should be impossible, 
then this would seem tantamount to picking out those theories that do contain non-local 
entanglement, as, drawing on the analogy with the familiar quantum mechanical case, we 
might expect that entanglement is required to cheat. By insisting on no bit-commitment
in our axioms, we rule out the Schrödinger-type theories from our consideration.

That is the intuitive idea. Clifton et al. (2003) argue rigorously as follows. First they
show that a C*-algebra $\mathcal{A}$ is nonabelian iff it allows ambiguous mixtures, i.e., distinct
mixtures of pure states giving rise to the same mixed state. As in the spin-1/2 example
given above, such mixtures may be used as the basis for Alice's bit commitment. They
then prove that if Alice and Bob only have access to classically correlated states (convex
combinations of product states), then the bit-commitment protocol based on these
distinct mixtures will be secure: there is no classically correlated state that will allow
Alice to change her commitment from 0 to 1 at the revelation stage. The contrapositive
statement of this result is that if, for a theory in which the algebras of observables for
individual systems are nonabelian, unconditionally secure bit-commitment is not possible
then entanglement between spatially separated systems must be allowed. The converse,
that for any quantum theory in the sense of Clifton et al., unconditionally secure
bit-commitment is not possible, was proven by Halvorson (2003).

The achievement of the CBH characterization theorem then, is, first of all, a formula- 
tion of the three information-theoretic constraints in the general setting of C*-algebraic 
theories, followed by the main result of a characterization of quantum theory in terms of
these three constraints: Any theory formulated in C*-algebraic terms that satisfies the
three information-theoretic constraints will take the form of a quantum theory; with a
non-commuting algebra of observables for individual systems, kinematic independence
for the algebras of space-like separated systems and the possibility of entanglement between
space-like separated systems; while conversely, any C*-algebraic theory with these
distinctively quantum properties will satisfy the three information-theoretic constraints.

How much light does this result shed on the nature or origin of quantum mechanics? 
Clifton, Bub and Halvorson suggest that 

The fact that one can characterize quantum theory... in terms of just a few
simple information-theoretic principles... lends credence to the idea that an
information-theoretic point of view is the right perspective to adopt in relation
to quantum theory. (Clifton et al., 2003, p. 4)

Certainly, the CBH characterization theorem indicates that concentrating on some 
information-theoretic principles has proven fruitful in providing a novel axiomatisation 
of the theory, but is something more than this intended by the statement that 'an 
information-theoretic point of view is the right perspective to adopt'? In particular, 
does the CBH characterization shed light on how we should understand the quantum 
formalism more broadly? Above all, does it have implications for the traditional interpretive
questions in quantum mechanics; for the knotty problems of the meaning of the
formalism? Clifton et al. (2003) seem to suggest so:



We... suggest substituting for the conceptually problematic mechanical per- 
spective on quantum theory an information-theoretic perspective. That is, 
we are suggesting that quantum theory be viewed, not as first and foremost
a mechanical theory of waves and particles... but as a theory about the
possibilities and impossibilities of information transfer. (Clifton et al., 2003,
p. 4)



The thought is pursued further by Bub:

Assuming the information-theoretic constraints are in fact satisfied in our
world, no mechanical theory of quantum phenomena that includes an account
of measurement interactions can be acceptable, and the appropriate aim of
physics at the fundamental level becomes the representation and manipulation
of information. (Bub, 2004, p. 242), my emphasis.

We shall return presently to the question of the interpretational implications of the CBH
characterization. First, let us consider some points relating to the C*-algebraic starting
point of the theorem.



9.2.2 Some queries regarding the C*-algebraic starting point 



It is of course evident that any axiomatic characterization of a physical theory has to 
start from somewhere, and as mentioned above, Clifton, Bub and Halvorson suggest that 
adopting a C*-algebraic framework is an appropriately neutral starting point. However, 
some questions can be raised about the strength of this starting assumption. 

For some, the very fact that C*-algebras make use of a complex vector space, as
opposed, say, to a real or quaternionic one, may already be to assume too much⁸. A
second sort of worry is raised by the existence of various 'toy-theories' that satisfy the
three information-theoretic constraints of the CBH characterization theorem and yet are
palpably not quantum mechanics (Spekkens, 2004; Smolin, 2003). These toy theories
are not counter-examples in the logical sense to the CBH theorem, as they fail to satisfy
the requirements of the theorem: Halvorson and Bub (2003) argue that Smolin's toy
theory exhibits physical pathologies as it violates an analogue of the C*-independence
condition, and Halvorson (2004) proves that Spekkens' toy-theory is not a C*-algebraic
theory. But if, from the point of view of the CBH characterization, what distinguishes
Spekkens' theory, which satisfies the three information-theoretic constraints, from quantum
mechanics, is the fact that it is not a C*-algebraic theory, then this throws into stark
relief the question of what the important physical, or information-theoretic, content of
the initial C*-algebraic assumption is.

We shall consider two further questions however. The first is discussed in some
depth in Halvorson (2004), but it bears re-emphasising. It concerns the role that can be
attributed to the no-bit commitment axiom when one starts in the C*-algebraic setting.

⁸Cf. Fuchs (2001, p.5), for example; this complaint is noted in Bub (2004); Halvorson (2004).

The role of no bit-commitment

As we have noted, the intuitive role for the no bit-commitment axiom in the characterization
theorem is to ensure that one arrives at theories which allow entanglement between
separated systems. However, it is known (Landau, 1987; Bacciagaluppi, 1994) that if the
C*-algebras $\mathcal{A}$ and $\mathcal{B}$ associated with two distinct (spatially separated) systems are kinematically
independent and non-commutative, then it already follows automatically that
there are entangled states for the joint system, in the C*-algebraic framework. That is, if
we assume no-signalling and no-broadcasting, then entanglement follows automatically,
and a further axiom is not required. But this seems to indicate that the formal structure
of C*-algebras is not as neutral as one might suppose and is really doing a good deal of
work in arriving at the distinctive quantum features we are seeking to derive.



This fact is already noted in Clifton et al. (2003). There the suggestion is made that
the third axiom is required nonetheless, to ensure that the entangled states for spatially
separated systems that arise are actually part of the physical state space, as opposed 
to being mere mathematical artefacts of the formalism. But this argument seems un- 
convincing. Whilst we are familiar with the idea that it may sometimes be necessary to 
place restrictions on the allowed states within a given state space (superselection rules 
and the like), the case we are now being asked to entertain is of a very different kind. 
It is not that we have a state space that we are restricting by adding a further clause — 
ruling certain states out — rather, we have a particular state space postulated, and are 
being asked to consider having to rule certain states in as physical. But ruling states 
in rather than out by axiom seems a funny game. Indeed, once we start thinking that 
some states may need to be ruled in by axiom then where would it all end? Perhaps we 
would ultimately need a separate axiom to rule in every state, and that can't be right. 
Thus the role that is supposed to be being played by the third axiom remains obscure. 

One might try to re-phrase the argument so as not to appeal to the objectionable idea
of the axiom being required to 'rule states in'⁹. One might instead emphasize that the
role of the no bit-commitment axiom is to rule out a certain class of theories (namely,
Schrödinger-type theories) that would still be on the table otherwise. But we should be
clear in what sense the Schrödinger-type theories are an option once one has postulated
the first two information-theoretic axioms. We know that all (C*-algebraic) theories
consistent with the first two axioms allow entangled states between space-like separated
systems, thus a Schrödinger-type theory, which lacks such states, could only arise as the
result of imposing further restrictions on allowed theories that cut the entangled states
out¹⁰. Thus a Schrödinger-type theory is only an option in the sense that we could
arrive at such a theory by imposing further requirements to eliminate the entangled
states that would otherwise occur naturally in the theory's state space. (Of course, such
a theory would not be quantum mechanics, and in the light of the experimental violation
of Bell inequalities, we know such a theory would not be empirically adequate, but that
is by-the-by.)

⁹Bub, personal communication.

Having postulated the first two axioms, the pertinent question to ask is whether the 
desired class of theories has then been delimited. The answer, given the C*-algebraic 
setting, is indeed 'yes'. The fact that there may be other types of (perhaps rather
gerrymandered) theory that could be reached by imposing further requirements of some 
sort would not seem to undermine this claim. We don't need to appeal to the no bit- 
commitment axiom to leave us only with quantum-type theories: all the theories before 
us (following the first two axioms) are of the desired type. 

The no bit-commitment condition does not seem, then, to play a genuine role in
characterizing quantum theory in a C*-algebraic setting, but to figure more as a
corollary: quantum theory may be characterized as a C*-algebraic theory that abjures both
superluminal signalling by measurement and broadcasting; having thus reached our
desired class of theory, it transpires that this desired class will also be one for which secure
bit-commitment is not possible. Note, though, that a scenario could be imagined in
which the no bit-commitment condition would play more of an active role. If, for some
reason, we were unsure about whether a Schrödinger-type theory or a quantum theory
were the correct physical theory, then being informed by an oracle whether or not
unconditionally secure bit-commitment was possible would be decisive: we would be saved
the effort of having to go out into the world and perform Aspect experiments. But as
this is not our position, the no bit-commitment axiom does not play an active role in
picking out quantum theory.

¹⁰N.B. a further option may be noted. It could be that the dynamics is such as to lead to decay
of entanglement on spatial separation; but to consider this possibility is, strictly speaking, to go
beyond the remit of the CBH theorem, which is intended to concern itself with the quantum mechanical
kinematics.

The position we have reached seems to be as follows. If one is attempting to provide
a characterization of quantum mechanics in information-theoretic terms, it seems
reasonable to desire an information-theoretic explanation of the existence of entanglement
(Clifton et al., 2003; […], 2004). Starting from the C*-algebraic setting of the CBH
characterization theorem, however, entanglement just seems to spring automatically
out of the mathematical machinery, when one would hope instead to be providing an
information-theoretic explanation. We have seen, moreover, that the no bit-commitment
condition is precluded from providing such an explanation in the context of the CBH
theorem. How, then, might one proceed?

One option, as Bub (2004, p. 6) notes, is to conjecture that in a weaker algebraic
setting (e.g. Segal algebras) the existence of entangled states would not follow from
the first two information-theoretic axioms, but would require the imposition of the no
bit-commitment axiom in addition. On the other hand, however, it is also conceivable
that the intuitive argument outlined above linking no bit-commitment to the existence
of entanglement might simply be misleading. Perhaps, in the end, it may turn out not
to be possible to cash out the intuitive argument formally.

Another option, if one is after a proper information-theoretic explanation of the 
appearance of entanglement, would be to provide an information-theoretic reason for 
the initial choice of C*-algebras as the mathematical framework. Then the fact that
entanglement emerges naturally in the framework would not be worrying. However, in 
this case, it is not immediately obvious that one should expect such a reason to be based 
on the possibility of bit-commitment. 

Additivity of expectation values 

There is another reason why it may be suspected that adopting a C*-algebraic approach
is perhaps an overly restrictive starting point; why the framework may not be quite so
neutral as it first appears. This concern centres on the nature of states in C*-algebraic
theories.

Ever since Bell's influential criticism of von Neumann's no-hidden variables theorem
(Bell, 1966; von Neumann, 1955, pp. 305-324) it has been widely appreciated that it
is an extremely strong assumption to adopt a requirement of additivity of expectation
values for observable quantities. Vide Bell:

...the additivity of expectation values...is a quite peculiar property of quantum
mechanical states, not to be expected a priori. (Bell, 1966, §3)

In particular, he goes on to note, when one is considering hidden variable theories: 

There is no reason to demand [expectation value additivity] individually of
the hypothetical dispersion free states, whose function it is to reproduce the
measurable peculiarities of quantum mechanics when averaged over. (Bell, 1966, §3)

These familiar observations are relevant to our concerns because the C*-algebraic 
notion of state makes precisely this assumption: states are linear functionals of observ- 
ables. In what follows I shall seek to elaborate this concern by adapting the methodology 
of Valentini. 

It is well known that in many ways, the Bohm theory is characteristic of what a
hidden variable theory for quantum mechanics must look like. We know, for example,
that any acceptable hidden variable theory would have to be nonlocal and contextual;
indeed it was the example of the Bohm theory that led Bell to pose the question of
whether any hidden variable theory replicating the predictions of quantum mechanics
would have to be nonlocal. Now the Bohm theory reproduces the empirical predictions
of quantum mechanics iff the probability distribution $P$ for particle positions is given by
$|\psi|^2$. That $P$ equals $|\psi|^2$ is an additional assumption in the standard Bohm theory; the
(primary) role of the wavefunction as a guiding field is logically independent of its role
in determining the distribution for particle position. Bohm (1952) therefore explicitly
countenanced the possibility that situations could arise in which $P$ would differ from
$|\psi|^2$ and thus empirical predictions would be expected that differ from those of quantum
theory; in particular, violation of the position-momentum uncertainty principle becomes
possible¹¹. However, he also went on to suggest that an argument could be given that
the distribution $P$ can be expected to tend to $|\psi|^2$ as a kind of equilibrium distribution.



This thought was developed in detail by Valentini (1991a), who showed that the
relation $P = |\psi|^2$ can indeed be derived as the 'quantum equilibrium' distribution
towards which systems will tend, as the result of a 'subquantum H-theorem'. He also
demonstrated that signal-locality (the impossibility of superluminal signalling via
measurement) and the uncertainty principle hold in general only in the equilibrium state,
i.e., only if $P = |\psi|^2$ (Valentini, 1991b). Thus the features of signal-locality and
uncertainty can be understood to arise as effective features of an underlying nonlocal and
deterministic theory, a pleasing result if one is exercised by the apparently conspiratorial
fact that quantum mechanics (on many interpretations) gives rise to nonlocality,
but only of a carefully restricted kind ('passion-at-a-distance'?) that does not permit
signalling and hence avoids explicit conflict with relativity.



More recently, Valentini (2002) has shown that the role of the Bohm theory as a
stereotype hidden variables theory extends further: it can be shown that for any
deterministic hidden variable theory, signal-locality will hold in general only in equilibrium.
These facts are pertinent to our discussion of the axiomatic derivation of quantum
mechanics from information-theoretic principles, as many of the principles appealed to will
be, from the perspective of a deterministic hidden variable theory, merely contingent
and accidental features of the equilibrium state. This factor leads Valentini (2002a) to
discuss the possibility of 'sub-quantum' information processing that would be possible
using non-equilibrium matter (perhaps matter left over from early stages of the life of
the universe (Valentini, 2001)). In particular, out of equilibrium, instantaneous signalling
would be possible, thus conflicting with the first of the three information-theoretic
conditions of the CBH theorem; and it would also become possible to distinguish
non-orthogonal states (Valentini, 2002a, §5), leading to a violation of no-cloning and hence
conflict with the no-broadcasting constraint of CBH.



¹¹'If the theory is generalized... The probability density of particles will cease to equal $|\psi|^2$. Thus
experiments would become conceivable that distinguish between $|\psi|^2$ and this probability; and in this
way we could obtain an experimental proof that the normal interpretation, which gives $|\psi|^2$ only a
probability interpretation, must be inadequate.' (Bohm, 1952, I §9)

Now we may adopt Valentini's framework of deterministic hidden variables theories
which admit of an equilibrium distribution that ensures empirical agreement with
standard quantum mechanics, along with non-equilibrium distributions that in general lead
to violations of the quantum predictions, in order to elucidate the sense in which the
assumption of linearity associated with the C*-algebraic notion of state may be seen as
problematic. In brief, the assumption of linearity, hence additivity of expectation values,
rules out by fiat the possibility of non-equilibrium deterministic hidden variables
theories. That is, one can show that additivity of expectation values can be expected to hold
only in equilibrium for such hidden variable theories. Thus, by taking C*-algebras as our
theoretical starting point, we are immediately ruling out the possibility of deterministic
hidden variables theories in the general case. But this is a big assumption.

The relevant result is a straightforward generalisation of Bell's argument contra von
Neumann¹². We will consider schematic hidden variables theories of the following sort.
Assume (following Bell (1966, 1982)) a function $f$ which determines the value of the
outcome of an experiment measuring the quantum mechanical observable $A$, for an
initial hidden variable $\lambda$ and quantum state $|\psi\rangle$. So $f$ is a function
$f(\lambda, |\psi\rangle, A)$ whose range is the set of eigenvalues of $A$¹³. The expectation
value of the observable $A$ will then be given in the usual way by averaging over the space
$\Lambda$ of hidden variables:

$$\langle A \rangle = \int d\lambda\, P(\lambda)\, f(\lambda, |\psi\rangle, A), \tag{9.1}$$

where $P(\lambda)$ is the probability distribution for the hidden variables $\lambda$. (This distribution
may also depend on the quantum state $|\psi\rangle$.) Ex hypothesi there exists an equilibrium
distribution $P_{eq}(\lambda)$ for which eqn. (9.1) will return the quantum expectation values.

¹²See also Valentini (2003) for a closely related discussion.

¹³Clearly, the mapping $f$ will in general also depend on the way in which the observable in question
is measured (in order to avoid the sorts of problem made famous by Kochen-Specker). For example,
in the Bohm theory, the mapping from the initial value of the hidden variable to determinate outcome
depends on the measurement Hamiltonian. (Compare also Valentini (2003, p. 6).)

Now we know that the function $f$ will not be linear in the observable argument $A$,
as the outcome of the measurement has to be one of the eigenvalues of the operator in
question, and the eigenvalues of linearly related operators are not themselves linearly
related (cf. Bell (1966, 1982)). The requirement of additivity of expectation values is that



$$\langle A + B \rangle = \langle A \rangle + \langle B \rangle;$$

in our deterministic hidden variable context this will become:

$$\int d\lambda\, P(\lambda)\, f(\lambda, |\psi\rangle, A + B) = \int d\lambda\, P(\lambda) \left[ f(\lambda, |\psi\rangle, A) + f(\lambda, |\psi\rangle, B) \right]. \tag{9.2}$$

Now we know this equation holds for the equilibrium distribution $P_{eq}(\lambda)$: it has to for
empirical adequacy; but it can hold for arbitrary $P(\lambda)$ only if

$$f(\lambda, |\psi\rangle, A + B) = f(\lambda, |\psi\rangle, A) + f(\lambda, |\psi\rangle, B),$$

that is, only if $f$ is linear in the observable argument. But we know it isn't, hence
expectation values won't be additive for general distributions for the hidden variables.
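
The step from additivity for arbitrary distributions to pointwise linearity of the outcome function can be made concrete with a short worked illustration. What follows is only a sketch under stated assumptions: it takes a spin-1/2 system with the Pauli observables $\sigma_x$ and $\sigma_z$ (the kind of example Bell uses against von Neumann), and it takes a distribution concentrated on a single hidden-variable value $\lambda_0$ as the extreme non-equilibrium case. The symbol $\lambda_0$ and the choice of observables are introduced here for illustration and do not appear in the argument above.

```latex
% Assumes amsmath. I denotes the 2x2 identity; \lambda_0 is an illustrative label.
\begin{align*}
&\text{Take } P(\lambda) = \delta(\lambda - \lambda_0). \text{ Additivity of expectation values then requires} \\
&\qquad f(\lambda_0, |\psi\rangle, A + B) = f(\lambda_0, |\psi\rangle, A) + f(\lambda_0, |\psi\rangle, B). \\[4pt]
&\text{Let } A = \sigma_x,\; B = \sigma_z. \text{ Each has eigenvalues } \pm 1,
  \text{ so the right-hand side lies in } \{-2,\, 0,\, +2\}. \\[4pt]
&\text{Since } \sigma_x^2 = \sigma_z^2 = I \text{ and } \sigma_x\sigma_z + \sigma_z\sigma_x = 0, \\
&\qquad (\sigma_x + \sigma_z)^2 = \sigma_x^2 + \sigma_z^2 + \sigma_x\sigma_z + \sigma_z\sigma_x = 2I, \\
&\text{so } \sigma_x + \sigma_z \text{ has eigenvalues } \pm\sqrt{2},
  \text{ and the left-hand side lies in } \{-\sqrt{2},\, +\sqrt{2}\}. \\[4pt]
&\text{As } \pm\sqrt{2} \notin \{-2,\, 0,\, +2\}, \text{ the equality fails for every } \lambda_0.
\end{align*}
```

On this sketch, additivity is recovered only after averaging with a suitable distribution over the hidden variables, which is the role the equilibrium distribution plays in the argument.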

Thus we see that the assumptions involved in the C*-algebraic notion of state are
arguably overly strong when seeking to provide an axiomatic characterization or deriva- 
tion of quantum mechanics. A large and potentially interesting class of theories is being 
ruled out by assumption. The requirement of expectation value additivity will not hold 
in general for a non-equilibrium deterministic hidden variable theory. Even if one is not 
particularly enamoured of hidden variables, this nonetheless serves as a vivid illustration 
of the fact that the assumption of states as linear functionals is a non-trivial one. 

Having presented this argument, however, it is important to note that there is a
danger of a certain degree of failure of communication between a proponent of the
argument and advocates of C*-algebras as a comprehensive framework for describing
physical theories. For, the latter will argue, there is surely no problem; the definite
particle trajectories of the Bohm theory, for example, can happily be incorporated into
the C*-algebraic framework: the algebra of observables for the Bohm theory, in fact,
will be the commutative algebra generated by the position observable (cf. Bub, 2004,
pp.257-8).

The source of the trouble is a possible equivocation over what is meant by
'observables' by the two parties. In the argument that I have presented, 'an observable' refers
to a quantum mechanical observable; in concrete terms, to those quantities measured
in the standard ways by quantum physicists in the lab. By contrast, when it is said
that Bohm trajectories may be described in the terms of a C*-algebraic theory, it is not
these observables which are the observables of the theory, hence my argument does not
get a grip; but equally, the theory in question does not then (in general) assign values
to the outcomes of the experiments we might expect to be interested in, namely those being
performed by quantum physicists.

If one is interested in a theory which assigns values to the outcomes of measurements 
that are performed by quantum physicists, i.e., to measurements of observables with 
the quantum structure (and such theories, of course, have a prominent history in dis- 
cussion of the foundations of quantum mechanics), then the argument given above will 
apply; in the general case, expectation value additivity will not hold. Even if one is
unmoved, though, and remains persuaded of the generality of the C*-algebraic framework
for all cases of interest, the argument described here remains important. It provides 
another example, to add to those already provided by Valentini, of where the assump- 
tions involved in the CBH characterization theorem depend, from the point of view of a 
deterministic hidden variable theory, on a special feature of quantum equilibrium; that 
is, on contingent and accidental matters of fact that will not obtain in general. 



9.2.3 Questions of Interpretation 

Perhaps the most intriguing question from the philosophical point of view is whether, or
to what extent, the CBH characterization theorem has implications for the familiar
interpretational questions of quantum mechanics. As we have noted, Clifton et al. (2003)
do seem to suggest that some implications of this nature are forthcoming. On reflection,
however, this suggestion may appear somewhat surprising: the aim of their enterprise,
after all, was to provide an axiomatic derivation of the mathematical structure of
quantum theory; yet we are all too aware that this structure may be subject to interpretation
in very many different ways (we saw, for example, an incomplete selection of views in
Section 3.5). One would think that to provide an axiomatic characterization of a
particular mathematical structure is to do just that and no more. Surely, when faced with
the same old structure before us once again, the standard range of interpretations will
be as applicable as ever?



Clifton et al. (2003) suggest, though, that their theorem intimates that quantum
mechanics may be seen as a principle theory and it is in this sense that an interpretation
is provided. Bub (2004) adopts a rather different tack. I shall maintain against
these arguments that the rather negative line of assessment just mooted regarding the
interpretational implications of the CBH theorem is nevertheless on track.



Quantum mechanics as a principle theory? 

The distinction between principle and constructive theories is familiar from Einstein's
discussions of his 1905 methodology in arriving at the correct form of relativistic
kinematics¹⁴. The paradigm example of a principle theory is thermodynamics, which is to
be contrasted with a constructive theory such as the kinetic theory of gases. While
constructive theories seek to 'build up a picture of the more complex phenomena out of
the materials of a relatively simple formal scheme from which they start out' (Einstein,
1919), principle theories proceed from the basis of some well grounded phenomenological
principles that are found to govern a class of physical processes of interest (e.g., the
non-existence of perpetual motion machines of the first and second kind, in the case of
thermodynamics), in order to derive constraints that all instances of such processes have
to satisfy.



As recounted in his Autobiographical Notes (Schilpp, 1979, p.49ff.) Einstein turned to
the methodological example of thermodynamics as a faute de mieux, given the confused
state of knowledge in electrodynamics and mechanics at the turn of the 20th Century:

Gradually I despaired of the possibility of discovering the true laws [of
electrodynamics and mechanics] by means of constructive efforts based on the
known facts. The longer and more desperately I tried, the more I came to the
conviction that only the discovery of a universal formal principle could lead
us to assured results. The example I saw before me was thermodynamics.
(Schilpp, 1979, p.49)

¹⁴His most detailed presentation of the distinction is to be found in Einstein (1919). See
Brown and Pooley (2001, 2004) for recent discussions of the principle/constructive distinction in
relativity; in particular for their emphasis that, as recognised by Einstein, principle theories lose out to
constructive theories in terms of explanatory power. As they note (Brown and Pooley, 2001), while the
distinction between principle and constructive theories is not absolute, it is nonetheless enlightening.

The Principle of Relativity and the Light Postulate became, of course, the principles that
Einstein fixed upon; and these allowed him to derive the correct form of the co-ordinate
transformations between inertial frames¹⁵.

Now Clifton et al. (2003) suggest that their theorem shows that quantum mechanics
may be understood as a principle theory, where the relevant principles are
information-theoretic, and that in this sense an interpretation of quantum mechanics is provided.
One has arrived at a description of the conditions (viz., the obtaining of the three
information-theoretic constraints) under which quantum theory will be true. To il- 
luminate this sense of interpretation, they present an illustrative fable in which one 
imagines that relativity had originally been formulated geometrically by Minkowski as 
an algorithm for relativistic kinematics, and then Einstein came along and provided 
an interpretation of this algorithm by presenting his principle theory derivation of the 
Lorentz transformations. Similarly, the analogy goes, we have quantum mechanics as 
an algorithm for predicting the results of various experiments; and this algorithm now 
finds an interpretation in terms of the three information-theoretic constraints. We now 
understand how the world is organised so that quantum theory has to be true (or so the 
claim). 

However, it may be doubted whether this approach provides us with a particularly 
interesting sense of 'interpretation'. To pursue the analogy with relativity: Einstein 
showed us why the co-ordinate transformations between inertial frames had to be the 
Lorentz transformations — if they were not then one or more of the principles (or the 
symmetry assumptions) in his derivation would have to be false. But this explanation, 
or interpretation, remains silent on a very important point. The fact that the Lorentz 
transformations are the correct transformations between inertial frames encodes a great 
deal of detail about the dynamical behaviour of (ideal) rods and clocks — these are, 
after all, complex material bodies. Arguably, the fact that the speed of light, say, is
measured to be the same in all inertial frames is ultimately to be explained in terms
of the dynamical behaviour of rods and clocks, a constructive style of explanation (cf.
Brown, 1993; Brown and Pooley, 2001, 2004). In any event, it is clear that if appeal to
the principles of relativity is providing an interpretation of the formulae of relativistic
kinematics, it is an interpretation that glosses over a lot: there is a good deal more to
be said about the conditions under which the Lorentz transformations constitute the
correct transformations between inertial frames.

¹⁵Note, however, that it would be a mistake to construe special relativity purely as a principle theory.
Einstein was later to refer to the 'sin' of treating rods and clocks as unanalysed bodies, as opposed to
'moving atomic configurations' (Schilpp, 1979, pp. 55-7); see also Pauli (1981, p.14) in this regard. This
point is elaborated in detail in Brown (1993); Brown and Pooley (2001, 2004).

Analogously, in the case of quantum mechanics, given the three information-theoretic 
constraints, the CBH theorem provides us with an explanation of why the states and 
observables in our theory have to take their characteristic quantum structure: if they 
did not, at least one of the assumptions would be false. But nothing is said about how 
the world should be understood if states and observables take on this form. 

By assumption, the world is such that the information-theoretic constraints are true, 
but this is too general and it says too little: it is consistent with a wide range of ways 
of understanding the quantum formalism. 

To elaborate: If one were to adopt the proposal under discussion, that quantum 
mechanics should be seen as a principle theory, then the objects of the theory whose 
behaviour the principles constrain are preparation devices and measuring apparatuses, 
considered as unanalysed black boxes. (Recall the association of states with prepara- 
tion devices and observables with measuring apparatuses in the C*-algebraic setting, 
discussed earlier). From the information-theoretic principles, the general sorts of rela- 
tions that should obtain between various preparations and measurements (and sequences 
of measurements) are derived. These principles are thought to provide an explana- 
tion (in some form) of why preparation devices and measuring apparatuses display the 
relations — in terms of observed relative frequencies of various experimental outcomes — 
that they do. Note that in saying this we are supposing what might be called a basic 
level of interpretation of our theory: we have related elements of the formalism (states, 
observables) with physical quantities (the statistical frequencies with which various out- 
comes of experiments may be expected). The main difficulty for the principle theory 
approach, construed as providing a putative interpretation of quantum theory, is that it 
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doesn't involve anything going beyond this most elementary level of interpretation. 

However, typically when one is concerned with the interpretation of a theory, and
in particular, with the interpretation of quantum theory, one is interested in the further 
question of how these reports posed in terms of experimental results are to be under- 
stood. Are they merely reports of brute regularities, for example — an instrumentalist 
view — or is something more realistic appropriate? Do measurements reveal pre-existing 
values, or contextually determined outcomes, or are they to be understood in some other 
way? And so on. This is the traditional battle-ground of interpretive questions in quan- 
tum theory; and something needs to be said at this level, even if it is the bare claim that 
there is no more to be said (instrumentalism)¹⁶. But the principle theory approach, as
it only engages with the statistical relations between preparation devices and measuring 
apparatuses, says nothing. 

Of course, various different approaches might be taken to specifying what is involved
in the interpretation of a theory. The route I have adopted here is close to that of
Redhead (1987). Redhead (1987, Chpt. 2) distinguishes two senses of interpretation of
a theory. To provide an interpretation in the first sense is to supply rules which
correlate elements of the mathematics of a theory with physical quantities. In this bracket,
for example, is what he terms the minimal instrumentalist interpretation of quantum
mechanics: the familiar rules that tell us what the possible results of measurements are
in quantum mechanics and how the statistical frequencies may be calculated with which
these measurement results will turn up when a measurement is repeated very many
times on systems prepared in the same way.

An interpretation in the second sense, he says, is: 

...some account of the nature of the external world and/or our epistemological
relation to it that serves to explain how it is that the statistical regularities
predicted by the formalism with the minimal instrumentalist interpretation
come out the way they do. (Redhead, 1987, p.44)

He goes on to note that we might simply accept the statistical regularities as brute facts,
which is to take the instrumentalist view (theories in physics just are instruments for
expressing regularities between observations); but this is certainly to take a stance on
interpretation in sense two¹⁷.

¹⁶This recalls the earlier discussion of Bohr's semantic ascent (Section 9.1.1): ascent notwithstanding,
something still had to be said about how claims made using the terms of the theory are to be
understood.

Now, the sense of 'interpretation' associated with the principle theory approach is 
this: an argument is given for why we have one theory (which is already interpreted 
in Redhead's sense 1) rather than another; why the states and observables take one 
form, rather than another. But to repeat, this doesn't tell us anything about how the 
theory thus chosen should be interpreted in sense 2. It is only a minimal instrumentalist 
interpretation (in sense 1) linking the formalism to empirical predictions that is ever 
involved. In the thin sense in which an interpretation might be forthcoming from the 
principle theory approach, it is not a sense of interpretation that engages with the 
traditional problems of the meaning of the quantum formalism: with the question of 
how this familiar formalism is to be understood. Since the result of the CBH theorem 
is to recover the standard structure of quantum theory, the usual range of interpretive
options will be open to us; and indeed one of these options must be taken, even if
one adopts the principle theory viewpoint as advocated by Clifton et al. (2003). Thus,
far from the CBH theorem motivating a principle theory viewpoint ('an
information-theoretic perspective') that ameliorates the conceptual puzzles of quantum mechanics,
we see that it simply fails to engage with these questions.

Bub's 2004 argument: a problem of underdetermination 

More recently, Bub (2004) has adopted a rather different line of attack. He argues that
in light of the CBH theorem we are not in fact free to adopt the full range of (sense
2) interpretations of the quantum formalism. Assuming that the information-theoretic
constraints are satisfied in our world, he insists, no mechanical theory of quantum
phenomena that includes an account of measurement interactions can be acceptable. Such
accounts will face, in his view, a problem of in-principle underdetermination which
renders them unacceptable:

...a mechanical theory that purports to solve the measurement problem is
not acceptable if it can be shown that, in principle, the theory can have no
excess empirical content over a quantum theory. (Bub, 2004, p. 261)

¹⁷'Indeed we shall often refer to the formalism of QM plus the minimal instrumentalist interpretation
in the first sense as the minimal instrumentalist interpretation in the second sense.' (Redhead, 1987,
p. 44) Redhead's minimal instrumentalist interpretation (sense 2) is what I earlier termed a statistical
interpretation.

We need to examine how this problem of underdetermination is thought to arise, 
but first it will be useful to have a rough statement of how the different styles of in- 
terpretation one might be interested in are to be divided up. For the purposes of this 
discussion, then, let us distinguish between those interpretations (in sense 2) that in- 
volve adding extra structure to the bare formalism to ensure a definite measurement 
outcome (this group would include the Bohm theory, hidden variables theories and the 
sorts of modal interpretation picked out by the Bub-Clifton uniqueness theorem (Bub, 
1997, Chpt. 4)); those interpretations that appeal to a non-unitary dynamics (i.e., dynamical 
collapse theories à la GRW); and those that stick as closely as possible to the 
bare quantum formalism (e.g., instrumentalist views and modern versions of the Everett 
interpretation*). 

It is the first group, Bub suggests, that will suffer from in-principle underdetermination, 
in light of the CBH theorem; while GRW approaches may conflict with the 
exact obtaining of the no bit-commitment axiom and are to be ruled out on that ground 
(spontaneous collapse might interfere with some efforts to cheat in bit-commitment (Bub, 
2004, p. 256)). Let us now see how the underdetermination argument is supposed to run. 

It is essential to recognise that the argument has two components. The first is the 
claim that follows from the CBH theorem, that if the information-theoretic constraints 
are satisfied in this world, then the empirical results we obtain will be those modelled 
by a quantum theory in the sense of Clifton et al. (2003) (i.e., a theory with a non-commuting 
algebra of observables for individual systems, kinematic independence for 
distinct systems and entangled states across space-like separated systems). The second 
part of the argument is the assertion that the information-theoretic constraints do hold 
in our world, both exactly (with no exceptions) and as a matter of law. 

*Bub in fact appears to lump the Everett interpretation in with 'extra structure' interpretations. 
While this may be appropriate for some attempts to cash out Everett's ideas, it is not for the more 
satisfactory (for this very reason!) modern versions of Everett, as formulated by Saunders, Wallace and 
company (see refs. in Section 13.51). This point is important for the conclusions that can be drawn from 
Bub's argument, as we will see below. 
Now consider an 'extra structure' interpretation, such as the Bohm theory. Bub 
views this as an extension of a quantum theory that seeks to describe the mechanics 
underlying the statistics of a C*-algebraic quantum theory. However, if the information-theoretic 
constraints are to hold, then the empirical predictions of the Bohm theory, or 
any other such extension ('extra structure interpretation'), must be just the same as those 
of the quantum theory. But now, if the information-theoretic constraints are both law-like 
and hold exactly, then in any physically possible world, the empirical predictions of 
such an extension will be just the same as those of the bare quantum theory. In other 
words, it is physically impossible that there could be any evidence that would favour one 
such extension over another: there is in-principle underdetermination. Accordingly, the 
claim is, we should reject all such extensions.* It is for this reason that extra structure 
interpretations are not acceptable, for Bub. 

This argument fails, however. It has no dialectical power against extra-structure 
interpretations as it involves a petitio principii. The crucial assumption, that the 
information-theoretic constraints are both law-like and hold exactly, is denied in the 
extra structure interpretations (at least in the case of the Bohm theory and hidden vari- 
ables theories). We have seen how, in the case of the Bohm theory and deterministic 
hidden variables theories, the information-theoretic constraints and even the assump- 
tion of expectation value additivity hold, if they hold at all, merely as contingent and 
accidental (non law-like) matters of fact. From the point of view of these theories, the 
constraints certainly don't hold in all physically possible worlds, and they might not 
even hold under all conditions in this world. Similarly, the argument against the GRW-type 
theories is also a petitio; while the information-theoretic constraints are law-like 
in this case, they don't always hold exactly: there may sometimes be a violation of no 
bit-commitment. But one does not provide an argument against a position by simply 
insisting on an assumption that is inconsistent with it. 

*Bub emphasizes that the epistemological principle at work here is not the (implausible) claim that 
it is never rational to adopt one theory over an empirically equivalent rival, but the far weaker claim that 
if there could never, in any physically possible world, be evidence favouring one theory over another, 
then it would not be rational to believe either. 

In all this, it is important to recognise that we only have reason to believe that 
the information-theoretic conditions obtain in the quantum context as they are consequences 
of the standard quantum formalism. The empirical evidence we have for them 
derives second-hand from the empirical evidence for quantum theory. The evidence for 
quantum theory doesn't settle the question of how the formalism is to be interpreted 
(if it did one wouldn't need to try to detour via the CBH theorem!), so the empirical 
evidence we have is consistent with various different views on what the status of the 
information-theoretic conditions should be. From the point of view of an 'extra struc- 
ture' interpretation such as the Bohm theory, they will, as we have said, be seen as 
contingent and accidental features that obtain in some conditions; from points of view 
that stick closely to the quantum formalism (instrumentalism, Everett), they will be understood 
as law-like and exact. But if the status of the information-theoretic constraints 
is explicitly an interpretation-dependent question, we may not appeal to an argument 
that essentially involves a controversial assumption about their status, in order to rule 
out certain forms of interpretation. 

Towards the end of his 2004 paper, Bub remarks that if one has succeeded in ruling 
out dynamical collapse theories and those interpretations that involve extra structure 
then 

It follows that our measuring instruments ultimately remain black boxes at 
some level that we represent in the theory simply as probabilistic sources of 
ranges of labelled events [...] i.e., effectively as sources of signals... (Bub, 2004, 
p. 261, original emphasis) 

Furthermore, he suggests: 

...this amounts to treating a quantum theory as a theory about the representation 
and manipulation of information... [A] consequence of rejecting 
Bohm-type hidden variable theories or other 'no collapse' theories is that we 
recognise information as a new sort of physical entity.... (Bub, 2004, p. 262) 

Regarding the first point, it is pertinent to note that if one accepts my broad three- 
way carving up of the different interpretational options, then even if one has somehow 
managed to rule out the first two sets of possibilities (extra structure and dynamical 
collapse), then this still leaves us with at least two options in the third category, that is, 
with some form of instrumentalism, or an Everettian approach. Now while instrumentalism 
may well be appropriately described in the terms Bub uses (measuring apparatuses 
that must remain as unanalysed black boxes), this characterization is by no means apt 
for the Everett interpretation. Here measurement is perfectly well analysable, as one 
particular sort of dynamical interaction amongst many, set within a realist view of the 
universal quantum state. 

On the second point, even if one has placed Everettian views to one side, it remains 
obscure in what sense quantum theory would have become a theory about the represen- 
tation and manipulation of information, if this is supposed to be more than a new way 
of describing an old instrumentalist view. There is a simple difficulty, for instance, with 
trying to cash this idea out by suggesting that a measuring apparatus can be seen as a 
source of signals. If one has a signal, then it is intelligible to ask what the signal signifies, 
or indicates (whether naturally or as a matter of convention). But what is a particular 
measurement outcome a signal of? It would seem that the only thing that could be 
signified would be something about pre-existing hidden variables; and this, presumably, 
is not what is desired at all.* As for the inference to information as a new sort of 
physical entity, it was, of course, a large part of the trajectory of argument in Part I 
to make such a conception appear implausible, even downright mistaken. In combative 
mood, one might insist that to give an otherwise instrumentalist view of quantum me- 
chanics a subject matter does not seem a sufficient reason to conclude that information, 
or quantum information, is an entity. 

*It is a quite different matter, of course, to consider a measuring apparatus as an information 
source in the sense of information theory, for then one is considering compressing and transmitting the 
output of the source, while the physical constitution of the source itself is wholly irrelevant (but for 
this very reason, one will not find any implications for quantum ontology here). From the point of 
view of information theory, the outputs of an information source signify nothing and have no meaning, 
conventional or otherwise, but no more, then, are they, strictly speaking, signals. They are elements 
which have no semantic, nor even syntactic significance. This is just to repeat the familiar line that 
'information' in the technical sense is not a semantic notion. If something is a source of signals then 
one might well be interested in applying communication theory to it and modelling it as an information 
source in the sense of that theory. But you don't make something a source of signals by considering it 
as an information source. 
It is clear that there is a good deal more to be said in the tale of the significance of 
quantum information theory for the meaning of the quantum formalism; a tale which, 
to a considerable degree, is still being written. 

In particular, considerations of time and space have precluded any discussion of what 
is perhaps the most radical proposal advanced so far: the quantum Bayesianism of Caves, 
Fuchs and Schack (Fuchs; Fuchs and Schack; Caves et al., 2002a, 2002b). 
Let me, then, by way of a send-off, provide the briefest sketch 
of this position, locating it with respect to the discussion of the previous two chapters. 
(I concentrate on the position as advocated by Fuchs.) 

The quantum Bayesian approach is characterized by its non-realist view of the quan- 
tum state: the quantum state ascribed to an individual system is understood to rep- 
resent a compact summary of an agent's degrees of belief about what the results of 
measurement interventions on a system will be. The probability ascriptions arising from 
a particular state are understood in a purely subjective, Bayesian manner. Then, just 
as with a subjective Bayesian view of probability, there is no right or wrong about what 
the probability of an event is, so with the quantum Bayesian view of the state, there is no 
right or wrong about what the quantum state assigned to a system is.* The approach 
thus figures as the terminus of the tradition which has sought to tie the quantum state 
to cognitive states, but now the cognitive state invoked is belief, not knowledge, and, 
crucially, the problems raised by factivity are thus avoided. 

*The fact that scientists in the lab tend to agree about what states should be assigned to systems 
is then explained by providing a subjective 'surrogate' for objectivity, along the lines that de Finetti 
provided for subjective probability: an explanation of why different agents' degrees of belief may be 
expected to come into alignment given enough data, in suitable circumstances (Caves et al., 2002b). 

Importantly, however, this non-realist view of the quantum state is not the end point 
of the proposal, but merely its starting point. (The aim, then, is for more than a new 
formulation of instrumentalism.) The hope expressed is that when the correct view is 
taken of certain elements of the quantum formalism (viz. quantum states and operations) 
it will be possible to 'see through' the quantum formalism to the real ontological lessons it 
is trying to teach us.* Given the point of departure of a Bayesian view of the state, and 
using techniques from quantum information, the aim is to winnow the objective elements 
of quantum theory (reflecting physical facts about the world) from the subjective (to do 
with our reasoning). Ultimately, the hope is to show that the mathematical structure of 
quantum mechanics is largely forced on us, by demonstrating that it represents the only, 
or, perhaps, simply the most natural, framework in which intersubjective agreement 
and empirical success can be achieved given the basic fact (much emphasized in the 
Copenhagen tradition) that in the quantum domain, it seems that the ideal of a detached 
observer may not be obtained. 

One of the main attractions of this approach, therefore, is that it aims to fill in an 
important lacuna associated with many views in the Copenhagen tradition: It is all 
very well, after all, adopting some non-realist view of the quantum formalism, but, one 
may ask, why is it that our best theory of the very small takes such a form that it 
needs to be interpreted in this manner? Why are we forced to a theory that does not 
have a straightforward realist interpretation? Why is this the best we can do? The 
programme of Caves, Fuchs and Schack sets out its stall to make progress with these 
questions, hoping to arrive at some simple physical statements which capture what it is 
about the world that forces us to a theory with the structure of quantum mechanics. 
Although the aim is to seek a transparent conceptual basis for quantum mechanics, 
there is no claim that the theory should be understood as a principle theory. In further 
contrast to the CBH approach, rather than seeking to provide an axiomatisation of the 
quantum formalism which might be interpreted in various ways, the idea instead is to 
take one particular interpretive stance and see whether this leads us to a perspicuous 
axiomatisation. 

*'[O]ne... might say of quantum theory, that in those cases where it is not just Bayesian probability 
theory full stop, it is a theory of stimulation and response (Fuchs, 2002b, 2003). The agent, through 
the process of quantum measurement stimulates the world external to himself. The world, in return, 
stimulates a response in the agent that is quantified by a change in his beliefs, i.e., by a change 
from a prior to a posterior quantum state. Somewhere in the structure of those belief changes lies 
quantum theory's most direct statement about what we believe of the world as it is without agents.' 
(Fuchs and Schack, 2003) 



Now Fuchs' direct arguments for the non-objective view of the quantum state are not, 
we may note, logically compelling (e.g. Fuchs, 2002a, §3); they are plausibility arguments 



based on the oddity of nonlocality in the EPR scenario; and those of a more realist bent 
might simply accept the nonlocality associated with collapse or hidden variables, or move 
to a realist view such as Everett that avoids the problem. But this is no real objection 
to the approach. The quantum Bayesian view is presented as a research programme: 
when this view of the quantum state and the quantum formalism is adopted, where does 
it take us? The proof of the pudding, ultimately, will be in the eating. Meanwhile, the 
approach is to be applauded for providing a consistent way, perhaps the only consistent 
way, of fruitfully developing the old line of thought that links the quantum state to 
information. But, finally, it might turn out in the end that taking the Bayesian 
route does cause us to give up too much of what one needs as objective in quantum 
theory. These matters deserve further discussion. 